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Gravitational lensing of the cosmic microwave background (CMB), a long-standing prediction of 
the standard cosmolgical model, is ultimately expected to be an important source of cosmological 
information, but first detection has not been achieved to date. We report a 3.4(t detection, by apply- 
ing quadratic estimator techniques to all sky maps from the Wilkinson Microwave Anisotropy Probe 
(WMAP) satellite, and correlating the result with radio galaxy counts from the NRAO VLA Sky 
Survey (NVSS). We present our methodology including a detailed discussion of potential contami- 
nants. Our error estimates include systematic uncertainties from density gradients in NVSS, beam 
effects in WMAP, Galactic microwave foregrounds, resolved and unresolved CMB point sources, and 
the thermal Sunyaev-Zeldovich effect. 



I. INTRODUCTION 

Within just two decades, cosmology has progressed 
from a rather speculative science to one of the most suc- 
cessful fields of physics, driven by an exemplary interplay 
between experiment and theory. Much of this progress 
has been owing to the well understood physics underly- 
ing the Comic Microwave Background (CMB) anisotropy, 
seeded by oscillations in the baryon-photon plasma of the 
early universe. 

Measurements of these fluctuations by a number of 
experiments have given rise to a basic cosmological 
paradigm, with the tightest current constraints on the 
cosmological parameter budget coming from combina- 
tions of data from the Wilkinson Microwave Anisotropy 
Probe (WMAP) satellite in conjunction with small 
scale CMB experiments (e.g. [1, 3, Q), and other rich 
probes of cosmological clustering and dynamics such 
as supernovae, galaxy surveys, the Lyman-alpha forest, 
weak lensing, and others (e.g. [^, 0, [E S El; [Hj [Hj 

El). 

The CMB promises to remain a gold mine for preci- 
sion cosmology, and two new frontiers lie ahead. First, 
a polarized component has recently been detected by a 
number of groups [H, El, [13, IH, IH, [111, offering e.g. 
the prospects of detecting primordial gravitational waves 
and constraining recombination physics. 

Second, large scale structure between the last scatter- 
ing surface and us alters the primary CMB anisotropy, 
through gravitational lensing (for a recent review of the 
theory see [Uj), through scattering off hot electrons 
in large scale structure (the Sunyaev-Zel'dovich effects) 
[2^ . , and through redshifting during the traverse of 
time-dependent potential fluctuations (the ISW effect) 
[2^ . A number of specialized instruments will soon begin 
to study details of these secondary anisotropics ^25, 2g]. 

As important as constraining cosmological and astro- 
physical parameters, detecting any of these effects is a 



crucial milestone for cosmological physics. The Sunyaev- 
Zel'dovich effect has been found by targeting clusters de- 
tected in X-ray [271, [28l [29l [SOj , also at high significance 
level using WMAP [31[ , and it has been observed in cross- 
correlation of galaxy surveys with WMAP [H, [H, [11] . 
The ISW effect has been detected in cross-correlation of 
WMAP with galax y s urve ys and with the hard X-ray 
background [11, [H, M [13, lH, [H, [H, [ill, [H [11 • 

A detection of gravitational lensing in the CMB has 
so far been outstanding. The main difficulty at millime- 
ter wavelengths is the high angular resolution needed, as 
typical deflection angles over a cosmological volume are 
only a few arcminutes. Non-Gaussianity imprinted by 
lensing into the primordial CMB may allow statistical 
detection with surveys at lower angular resolution, but 
the signal-to-noise is currently too low for internal detec- 
tions. Cross correlation with other tracers of large scale 
structure offers a way to limit systematics and increase 
the signal to noise. 

A first attempt [13] was made by cross correlating the 
WMAP first year release [l| with data from the Sloan 
Digital Sky Survey (SDSS) 3. These authors used a 
sample of 503,944 SDSS Luminous Red Galaxies (LRG's) 
overlapping with ~ 10% of the sky observed by WMAP. 
They were not able to find evidence for gravitational lens- 
ing within statistical error bounds. While SDSS LRG's 
have a well understood redshift distribution, their num- 
ber density drops rapidly beyond z = 0.5, and has only 
marginal overlap with the higher redshift range that is 
geometrically optimal for CMB lensing. Photometric 
quasars found in SDSS may offer an additional handle. 

Here we go a different route, using the 1.9 million ra- 
dio sources found in the NRAO VLA Sky Survey (NVSS) 
[isj . The large sky coverage and estimated depth of 
NVSS make it an excellent candidate for a search for 
CMB lensing in cross correlation with WMAP. The sur- 
vey covers 77% of the sky, 58% of which is found to over- 
lap with WMAP, once masks to limit systematics have 
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FIG. 1: Left panel: CMB signal power spectrum, and three- 
year WMAP noise power spectrum. Right panel: Fiducial 
NVSS signal power spectrum, and NVSS shot noise (G — 
159000 gal/steradian). 



tic. The former will be optimally handled by our estima- 
tor. Although the latter are well controlled for the power 
spectrum estimation [H, lU , they could potentially af- 
fect our lensing estimator as will be discussed below. We 
will show how the formalism presented in [46.] allows us 
to control them in our particular context too. Another 
source of systematic error might come from other as- 
trophysical sources, namely residual galactic foregrounds 
(synchrotron, free- free and dust), residual point sources 
and the signature of galaxy clusters via the Sunyaev- 
Zeldovich effect. These potential contaminants will be 
discussed in later sections. 



B. NVSS 



been applied. 

The structure of this paper is as follows. 

First, we describe the datasets f pl)) . theory and 
pipeline ( ijIV[) that will be used for detecting CMB lens- 
ing by reconstructing the lensing potential from WMAP, 
and cross-correlating the result to NVSS. The detection 
is shown, with statistical errors only, in Fig. \5\ The rest 
of the paper is devoted to null tests and assigning system- 
atic errors: NVSS systematics (S|V|, WMAP beam effects 
and Galactic foregrounds f WI|) . resolved and unresolved 
point sources f WII|) . and Sunyaev-Zeldovich fluctuations 
f WIIip . Wc quote our final result including systematic 
errors (Fig. [TO]) in mXl where we also mention future 
directions. 

In our calculations we will assume throughout the cos- 
mological model favored by a combination of WMAP, 
smaller scale CMB experiments, and other data (the 
WMAP-I-ALL analysis, 0): a local expansion rate Hq = 
70.4 km/s/Mpc, primordial power spectrum slope = 
0.947, matter and dark energy fractions of Hq = 0.267 
and SI A = 0.733 respectively, and amplitude erg — 0.773. 



II. DATASETS 
A. WMAP 

With the goal of producing full sky maps of the CMB 
with unprecedented accuracy, the WMAP satellite was 
launched in June 2001. Since then, it has been mapping 
the sky using 10 Differential Assemblies covering 5 fre- 
quency bands centered at 23 (K), 33 (Ka), 41 (Q), 61 
(V) and 94 GHz (W). In our analysis we use the 2 Q- 
band, 2 V-band, and 4 W-band temperature maps pro- 
duced using 3 year s of observations [46| and made pub- 
licly available [104j . We will use as a default mask the 
KpO mask, which cuts out the Galactic plane and point 
sources bright enough to be resolved by WMAP, leaving 
about 78.46% of the sky ^. 

The intrinsic quality of this dataset leaves us with few 
instrumental systematic effects to worry about [13, IH, 
|49| |. Nonetheless, noise inhomogeneities and beam ef- 
fects could be of particular concern for our lensing statis- 



As a tracer of the large scale density field, we use ob- 
servations resulting from the NRAO VLA Sky Survey 
(NVSS) 1.4 GHz continuum survey. This survey covers 
82% of the sky with i5 > -40° [11] with a source catalog 
containing over 1.8 x 10^ sources that is 50% complete 
at 2.5 mjy. It is appropriate for our purpose since most 
of the bright sources are AGN-powered radio galaxies 
and quasars whereas the less bright ones correspond to 
nearby star-forming galaxies. As a consequence, almost 
all the sources away from the Galactic plane (|6| > 2°) 
are extragalactic. 

We pixelize the NVSS catalog using HEALPix ^ 

maps with iVgide, = 256 corresponding to around 

14' square pixels [l05l. As an extra precaution, we re- 
moved sources with a flux greater than 1 Jy as well as 
a 1 degree disk around them. We also mask out pix- 
els at low Galactic latitude {\b\ < 10°) and those unob- 
served by the survey {5 < —36.87°). We ended up with 
1.29 X 10^ sources with an average density G = 159000 
gal/steradian. 



III. CMB LENSING 

Weak lensing by large scale structure remaps the CMB 
temperature field on the sky; the lensed temperature 
r(n) and unlensed temperature r(n) are related by [Slf 

f(n) =T(n + d(n)) (1) 

where d(n) is a vector field representing the deflection 
angles. To first order in perturbation theory, d(n) is 
expected to be a pure gradient: 

dain) = V„0(S) (2) 

where the scalar potential <j) is given by the line of sight 
integral: 

0(fi) = -2 dx (^^^^) ^(xn, 770 - x) (3) 
Jo \ XX* J 

where x denotes conformal distance along the line of 
sight in the assumed flat cosmology, x* is the confor- 
mal distance to recombination, and rjQ is conformal time 
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FIG. 2: Left panel: Auto power spectrum C/** of the CMB 
lensing potential, and reconstruction noise power spectrum 
iV/"^ (Eq. (gj) at three-year WMAP noise levels. Right 
panel: Cross power spectrum Cf^ between the CMB lens- 
ing potential and NVSS galaxy counts, and the efltective 
noise power spectrum [Nf"" Nf^ /2Y/^ for detecting the cross- 
correlation. The "boost" in signal-to-noise between the two 
cases is sufficient to obtain a several-sigma detection of CMB 
lensing. 



today. The integral in Eq. ([3]) receives contributions from 
a broad redshift range with median around 2^2. 

How can CMB lensing be detected in data? At the 
power spectrum level, lensing slightly smooths the acous- 
tic peaks in the temperature power spectrum Cj^ and 
adds power in the damping tail [53 |. However, these ef- 
fects are too small to be detectable in existing datasets. 
Going beyond the power spectrum, the effect of CMB 
lensing on higher-point statistics of the CMB is stronger 
and requires less instrumental sensitivity to dete ct )53l|. 

The theory of CMB lens reconstruction [H, Ha, [Sa . 
[S?! provides a framework for extracting this higher-point 
signal which we will use throughout this paper. One 
first defines a quadratic (in the CMB temperature T) 
estimator for the CMB lensing potential 0. The simplest 
higher-point estimator for detecting CMB lensing would 
be the power spectrum Cf'^: a quadratic estimator in the 
reconstruction or a four-point estimator in T . 

However, the three-year WMAP data do not have suf- 
ficient sensitivity to detect CMB lensing via the auto 
power spectrum Cf"^. This can be seen by considering 
the statistical "noise" in the reconstruction; in [U it is 
shown that the reconstruction noise power spectrum Nf'^ 
is given by: 
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2t 



+ 1 ^ 



(4) 



In Fig. [2] (left panel) we have shown the noise power 
spectrum Nf'^ for three-year WMAP sensitivity, with 

the fiducial signal power spectrum Cf'^ shown for com- 
parison. Although the CMB temperature anisotropics 
are signal-dominated across a wide range of angular 
scales (Fig. [T]), the lens reconstruction is highly noise- 
dominated. At this level of signal-to-noise, an "internal" 
(to WMAP) detection of CMB lensing, by measuring the 
auto power spectrum Cf"^, is not possible. 

It is frequently the case that a signal which is too 
noisy for internal detection can nonetheless be detected 
via cross-correlation to a second, less noisy signal. (For 
example, the first-year WMAP data had poor sensitivity 
to the EE polarization signal, but contained a many- 
sigma detection of CMB polarization via the TE cross- 
correlation [58]). In this paper, we will detect the lens- 
ing signal in WMAP by cross-correlating to radio galaxy 
counts in NVSS, thus detecting a nonzero cross power 
spectrum Cf^. The galaxy field g is much less noisy than 
(j) (Fig. [1]), l3ut the two fields have a significant redshift 
range in common and so are highly correlated; the cor- 
relation in the fiducial model is ~ 0.65 on angular scales 
i < 100. Therefore, the effective signal-to-noise is higher 
for the cross-correlation (Fig. [51 right panel) . A forecast 
based on this signal-to-noise ratio, and the assumption of 
simple /sky scaling, predicts that a ~ 3 — 4 sigma detec- 
tion can be made. If the same forecast is repeated using 
the parameters from 'S| (i.e. first-year WMAP sensi- 
tivity and Sloan LRG's over 4000 deg^), we find a 1 
sigma result, in agreement with previous results. 

In addition to the improved statistical errors from 
higher signal-to-noise, obtaining the detection as a cross- 
correlation is more robust to systematics, as we will see 
in detail in ifVl - Willi Any source of systematic contami- 
nation which appears in either WMAP or NVSS, but not 
both, will not bias our estimates for the cross power spec- 
trum Cf^, since it does not correlate the two surveys. At 
worst, such a contaminant can affect the statistical sig- 
nificance of the detection, by increasing the error bars on 
each bandpower. 

Our estimator for Cf^ will be defined by cross- 
correlating the quadratic reconstruction of the lensing 
potential (j) to the NVSS overdensity field g. Thus the 
estimator is three-point: two-point in the CMB temper- 
ature and one-point in the galaxy field. The same three- 
point estimator can also be derived from the general the- 
ory of bispectrum estimation [H, [13, [6l| . 

The most general three-point correlation between two 
CMB multipoles and one galaxy multipole which is al- 
lowed by rotational and parity invariance is of the form: 



where Fi-^i^i^ is defined by 
Fi^i^i^ = Gillies fiii2i3 



(5) 
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(8) 



This equation defines the bispectrum bij^g^e^ ■ (More prop- 
erly, with the Gi-^i^i^ prefactor included, we have defined 
the "reduced bispectrum" in Eq. ([5]); with this prefactor 
beiiits reduces to the flat sky bispectrum in the limit of 
large i p33 |. Whenever we write bispectra in this paper, 
£1,(2 are understood to denote CMB multipoles and 4 
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FIG. 3: Mean contribution to the squared total detection 
significance per NVSS galaxy multipole Is (left panel), 
and per unit increase in maximum CMB multipole t'^^ — 
max(£i, ^2) (right panel). Most of the statistical weight comes 
from galaxy multipoles near £ ~ 50, and CMB multipoles near 
£ ~ 400. 



denotes a galaxy multipole. 

From this perspective, the CMB lensing signal simply 
gives a contribution to the bispectrum which we want to 
measure. The lensing bispectrum is proportional to Cf^: 



(9) 



One can think of this as a single bispectrum which is 
estimated to give an overall detection, or a linear combi- 
nation of independent bispectra corresponding to band- 
powers in Cf^. 

In Appendix [Bl we show that the lens reconstruction 
and bispectrum formalisms are equivalent, so that it is 
a matter of convenience which to use. In this paper, we 
have generally used the lens reconstruction formalism, 
but will occasionally refer to the bispectrum formalism 
when it provides additional perspective. 

One issue which is clearer from the bispectrum per- 
spective is the distribution of statistical weight. Suppose 
we consider the total squared detection significance cr^, 
rather than splitting the signal into bandpowers. Start- 
ing from the bispectrum in Eq. one can write 
as a sum over multipoles (^1,^2,^3)- In Fig. [3l we have 
split up this sum to show the contribution per multipole. 
(Since there are two CMB multipoles, we show the con- 
tribution per unit increase in the maximum multipole 
^majf — max(£i,£2)-) It is seen that the greatest statis- 
tical weight comes from galaxy multipoles near ^3 ~ 50, 
and CMB multipoles near £ ^ 400 corresponding to an 
acoustic trough in the primary CMB. In bispectrum lan- 
guage, most of the signal is in "squeezed" triangles where 
the galaxy wavenumber is much smaller than the two 
CMB wavenumbers. This corresponds to the intuitive 
statement that lens reconstruction estimates degree-scale 
lenses indirectly through their effect on smaller-scale hot 
and cold spots in the CMB. 



IV. PIPELINE 

In this section, we describe our simulation and analy- 
sis pipeline for estimating the cross power spectrum Cf^ 
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FIG. 4: Simulation -I- analysis pipeline used in this paper; the 
stages (l)-(7) are described in detail in ijlVI 



from the WMAP and NVSS datasets, and present re- 
sults with statistical errors. (Systematics will be treated 
in Wl ^Vml ) 



A. Pipeline description 

Our pipeline is shown in Fig.d] Steps (l)-(4) represent 
the simulation direction and produce simulated WMAP 
and NVSS datasets with CMB lensing. Steps (5)- (8) are 
the analysis direction and produce power spectrum es- 
timates Cf^ in bands 6, starting from the WMAP and 
NVSS datasets. We now describe each step in detail. 

The first step (1) is simulating Gaussian fields: the 
unlensed CMB temperature, lensing potential, and (shot 
noise free) radio galaxy field g. We use the power spectra 
Cj'^,Cf''',Cl^,Cf'^ in the fiducial model. The last two 
are computed using the Limber approximation (e.g. [63| ) 
and a simple constant galaxy bias model: we take the 
galaxy overdensity to be given by the line of sight integral 



6g{n) 



X) 



(10) 



using a fiducial redshift distribution dN/dx and galaxy 
bias bg that will be discussed in the next section. 

In step (2), we compute the lensed CMB from the lens- 
ing potential and unlensed CMB. The lensing operation 



T{n) =r(S + d(n)) 



(11) 



is performed directly in position space (rather than rely- 
ing on an approximation to Eq. pip such as the gradient 
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approximation) . The right-hand side of Eq. (jlip is evahi- 
ated using cubic interpolation on a high resolution (w 0.5 
arcmin) map. 

In step (3), we simulate the eight Q, V, and W-band 
channels of WMAP. The maps are simulated at Healpix 
resolution A^side = 1024 and downgraded to -/Vgido = 512 
to minimize pixelization artifacts. To simulate each map, 
we first convolve with the beam and pixel window in har- 
monic space: 

aira BiWtatm (12) 

where Bi is the beam transfer function (distinct for each 
channel) and Wg is the pixel window function. We then 
take the spherical transform and add Gaussian noise to 
each pixel. The noise RMS is pixel-dependent but the 
noise is assumed uncorrelated between pixels. 

As the last step in the simulation direction, in step 
(4) we simulate NVSS, including clustering which is con- 
trolled by the Gaussian field g, by generating a Poisson 
galaxy count in each pixel p whose mean is given by 

A(p) =n(l + .g(p)) (13) 

where n is the mean number of galaxies per pixel over the 
survey. We simulate NVSS at A^sido = 1024 and down- 
grade to A^sidc = 256. 

Step (5) is the first step in the analysis direction: we 
start with the pixel-space maps corresponding to the 
eight Q, V, and W-band WMAP channels, and compute 
a single harmonic-space map a^m representing the inverse 
signal -I- noise filtered temperature a = {S + N)^^a. This 
reduction step is a common in gred ient in many types of 
optimal estimators [13, HH, [H, [H, [H, [13] ■ The general 
principle is that the filtering operation completely incor- 
porates the sky cut and noise model, so that optimal 
estimators can be constructed by simple subsequent op- 
erations directly in harmonic space. For example, the 
optimal TT power spectrum estimator is obtained by 
straightforwardly computing the power spectrum . 

Here and throughout the body of the paper, we will de- 
fer technical details of the estimators to Appendices \X[ iBl 
and concentrate on conveying intuition. In this case, the 
idea is that the {S + N)~^ filter simply weights each 
mode of the data by the inverse of its total variance, 
so that poorly measured modes are filtered out. For 
example, the sky cut is incorporated into the noise co- 
variance N by assigning infinite noise variance to pix- 
els which are masked (in implementation, we use 
rather than N and set the relevant matrix entries to 
zero). Data outside the sky cut is then completely fil- 
tered out: the map a is independent of the map values 
in masked pixels, and everything "downstream" in the 
analysis pipeline will be blind to the masked data. As 
a similar example, we marginalize the CMB monopole 
and dipole modes by assigning them infinite variance. 
Finally, the beam transfer functions (Eq. p2p ) are kept 
distinct in the filtering operation, so that optimal fre- 
quency weighting is performed: the filtered map aim will 
receive contributions from all frequencies at low £, but 
will depend mainly on the highest-frequency channels 
(i.e., the channels with narrow beams) at high £. The 



filtered map a = {S + N)~^a can also be thought of as 
the least-squares estimate of the signal, given data from 
all channels. 

In step (6), we perform lens reconstruction. Given the 
filtered CMB temperature aim from step (5), we compute 
the reconstructed potential (pemi defined by the equation: 

^imYemix) = V'^(a(x)Va/3(a;)) (14) 

where a and f3 are defined by 

a{x) = ^aimYemix) (15) 

(3ix) = ^Cj^S,™r,™(x) (16) 

£m 

As explained in [S^ . (f)£m is a noisy reconstruction of 
the CMB lensing potential (or more precisely, the inverse 
noise weighted potential N^^cj), where A^^ is the noise co- 
variance of the reconstruction) which is quadratic in the 

CMB temperature. Note that both a and cj) are defined 
in harmonic space, but Eq. (|14p involves multiplication 
and derivative operations in real space; in Appendix [B] 
we explain in detail how (j)^^. is computed. 

In step (7), we perform inverse signal -I- noise filter- 
ing on the NVSS data: given pixel-space galaxy counts, 
we compute the harmonic-space map gim = {S + N)~^g 
where the noise covariance A^ represents shot noise. This 
is analagous to the WMAP filtering operation in step (5), 
but there is one new ingredient. In addition to marginal- 
izing data outside the sky cut, and the monopole and 
dipole, we marginalize any mode which is independent of 
the angular coordinate (p in equatorial coordinates. (In 
harmonic space, this is equivalent to marginalizing modes 
with m — 0.) This is needed to remove a systematic ef- 
fect in NVSS which we will discuss in detail in |JVl for 
now we remark in advance that all results in this paper 
include this marginalization. 

Finally, in step (8), we compute the bandpower estima- 
tor C^^ by cross-correlating the fields (f>gm and gim from 
steps (6) and (7). There is one wrinkle here: as we show 
in Appendix[Bl to obtain the optimal estimator, we must 
include an extra term which subtracts the Monte Carlo 
average {</)) taken over unlensed simulations of WMAP: 

Ct = ^ E ii^^^n-iMrigim) (17) 

ieb ^ 

-i<ra<l 

where Mh is a normalization constant to be discussed 
shortly. (We have included the factor l/£^ since we es- 
timate bandpowers assuming that Pcf^ is flat in each 

band.) Note that the Monte Carlo average {(j)gm) vanishes 
for symmetry reasons in the case of full sky coverage and 
isotropic noise, but sky cuts or noise inhomogeneities will 
give rise to a nonzero average. The extra term in Eq. (jl7p 
simply improves the variance of the estimator by sub- 
tracting the spurious cross-correlation between this aver- 
age and the galaxy field g. 
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FIG. 5; Detection of CMB lensing via the cross power spec- 
trum Cf^ between the reconstructed potential and galaxy 
counts. The three la error bars on each bandpower repre- 
sent different Monte Carlo methods: WMAP simulations vs 
NVSS simulations (left/black), WMAP data vs NVSS simu- 
lations (middle/blue), and WMAP simulations vs NVSS data 
(right/red). These error bars represent statistical errors only; 
the result with systematic errors included will be shown in 
Ftg.[TM 



We determine the estimator normalization Afb by end- 
to-end Monte Carlo simulations of the pipeline, including 
a nonzero Cf^ in the simulations for calibration. (Strictly 
speaking, the normalization should be a matrix which 
couples bands 6^6', but we have neglected the off- 
diagonal terms, which are small for our case of large sky 
coverage and wide bands.) As we will see in Appendix iBl 
the normalization TVf, is proportional to a cut-sky Fisher 
matrix element, which must be computed by Monte Carlo 
unless an approximation is made such as simple /sky scal- 
ing. In addition, Monte Carlo simulations are also needed 
to compute the one-point term in Eq. p7p . 

This concludes our description of the pipeline. We have 
not motivated the details in the construction of our lens- 
ing estimator d^^, but in Appendix El we show that the 
estimator is optimal, by proving that it achieves statisti- 
cal lower limits on the estimator variance, so that the best 
possible power spectrum uncertainties are obtained. This 
justifies the combination of ingredients presented here: 
inverse signal -I- noise filtering (steps 5 and 7), keeping 
the lensing potential in harmonic space (step 6), and in- 
cluding the one-point term in the cross-correlation (step 
8); and shows that no further improvements are possible. 



B. Results 

The result of applying this analysis pipeline to the 
WMAP and NVSS datasets is shown in Fig. El We em- 
phasize that the uncertainties are purely statistical. Sys- 
tematic errors will be studied in i fVl Willi and an up- 



FIG. 6: CMB lensing detection obtained by analyzing Q-band 
(left/black error bar in each triple), V-band (middle/blue), 
and W-band (right/red) data from WMAP separately, show- 
ing consistency of the result between CMB frequencies. 



dated version of the result shown in ijIXl where we also 
show that the detection significance with systematic er- 
rors included is 3.4cr. 

Our error bars were obtained by Monte Carlo, cross- 
correlating simulations of WMAP and NVSS. As a con- 
sistency check. Fig. [5l shows that nearly identical er- 
ror bars are obtained if WMAP simulations are cross- 
correlated to the real NVSS data, or vice versa. This 
is an important check; if it failed, then we would know 
that our simulations were failing to capture a feature of 
the datasets which contributes significant uncertainty to 
the lensing estimator. In addition, it shows that the 
uncertainties only depend on correctness of one of the 
simulation pipelines. Suppose, for example, that the 
NVSS dataset contains unknown catastrophic systemat- 
ics which invalidate our simulations. Because the same 
result is obtained by treating NVSS as a black box to 
be cross-correlated to WMAP simulations, it is still valid 
(provided that WMAP contains no "catastrophic" sys- 
tematics!) 

As another consistency check, in Fig. [HI we show the 
detection that is obtained if each frequency in WMAP is 
analyzed separately. No signs of inconsistency are seen, 
although we have not attempted to quantify this pre- 
cisely: the results obtained from different frequencies are 
correlated even though the CMB noise realizations are 
independent, because NVSS is identical and so is the un- 
derlying CMB realization. For the same reason, we cau- 
tion the reader that the three sets of error bars in Fig. [51 
cannot be combined in a straightforward way to obtain 
an overall result. The best possible way of combining the 
data is already shown in Fig. [51 the maps from the three 
frequencies are combined into a single CMB map which 
is cross-correlated to NVSS. 
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C. Curl null test 

Our lensing estimator Cf^ detects a gradient compo- 
nent in the deflection field da via cross-correlation to ra- 
dio galaxy counts. If we instead decompose the deflection 
field into gradient and curl: 



1 — I — I — I — I — I — I — I — I — I — I — I — I — r 



(i,(n) = Va(/.(n) + e,6VV(n) 



(18) 



then one can similarly devise an estimator Cf^ to de- 
tect the curl component. Since the curl component is ex- 
pected to be absent cosmologically, this is a null test [6^ . 
Note that we have parameterized the curl component by a 
pseudoscalar potential V'j for notational uniformity with 
the gradient component which is parameterized by its 
scalar potential 0. 

In Appendix [B] we show that the optimal estimator is 
constructed as follows. First, we define a reconstructed 
potential ip which is quadratic in the CMB temperature: 

= e''^V,(a(x)Vb/3(x)) (19) 

with a, (3 as in Eqs. p^ . fTB]) . Second, we define a power 
spectrum estimator by cross-correlating to galaxy counts, 
subtracting the one-point term: 



Pi^g dcf 1 



-l<m<t 



1 ~ 



{ipim))*{gi 



(20) 



This construction is identical to our Cjsnstruction 
(Eqs. (fT4|. (fT7|l )) of the lensing estimator C^^ , except 
that a 90° rotation has been included (via the antisym- 
metric tensor Sab) in Eq- (|19p . 

The result of the curl null test is shown in Fig. [71 The 
for the null test is 12.1 with 8 degrees of freedom, so 
the null test passes. 

How strong is the null test obtained by demanding that 
C^g be consistent with zero? One might hope that as- 
trophysical contaminants, such as point sources or the 
Sunyaev-Zeldovich effect, would contribute both gradi- 
ent and curl components to the reconstructed deflections, 
and thus be monitored by the null test. However, parity 
invariance requires Cf^ = even when "0^0. Since 
astrophysical contaminants are expected to obey par- 
ity invariant statistics, they will not bias Cf^ on aver- 
age. Our null test therefore only monitors contaminants 
which can violate parity invariance, such as Galactic fore- 
grounds or instrumental systematics. This is analgous to 
the Cf^ — null test in CMB polarization experiments: 
it is not sensitive to all sources of contamination, but is 
nevertheless an important sanity check. 

We remark that for a detection of CMB lensing which 
is internal to the CMB (detecting lensing via the auto 
power spectrum Cf'^ , rather than the cross spectrum Cf^ 
considered here), one would have one null test (Cf''' ^ 
0) which can monitor parity-invariant contaminants, and 
one null test {Cf^ = 0) which cannot. 
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FIG. 7; Result of the curl null test (Cf^ = 0). As in Fig. [5] 
the three error bars on each bandpower represent different 
Monte Carlo methods: WMAP simulations vs NVSS simu- 
lations (left/black), WMAP data vs NVSS simulations (mid- 
dle/blue), and WMAP simulations vs NVSS data (right/red). 



V. NVSS SYSTEMATICS 

In the previous section, we obtained a statistical detec- 
tion of CMB lensing (Fig. [5|) by cross-correlating WMAP 
and NVSS, and showed that two consistency checks were 
satisfied: frequency independence (Fig. [6]) and a curl null 
test (Fig. [7]). The rest of the paper is devoted to studying 
potential instrumental and astrophysical contaminants 
of the lensing signal, to show that the observed lensing 
cross-correlation is not due to systematic contamination. 
In this section, we will consider NVSS systematics. 

If a maximum likelihood galaxy power spectrum is cal- 
culated from NVSS using the sky cut described in iJTTl the 
power spectrum C^® shown in the top panel of Fig. [8] 
is obtained. The very high bandpower in the lowest 
£ band is a clear sign of systematic contamination. If 
the low £ modes are isolated by low-pass filtering the 
NVSS galaxy counts to £ < 10, the resulting map shows 
azimuthal "striping" when plotted in equatorial coordi- 
nates (Fig. [9]). This is a known systematic effect in NVSS 
[TOj : due to calibration problems at low flux densities, 
the galaxy density has a systematic dependence on dec- 
lination, which can mimic long-wavelength modes in the 
galaxy field. 

To remove this contaminant, we analyze NVSS in equa- 
torial coordinates, and marginalize any modes in the data 
which are constant in the azimuthal coordinate ip. The 
marginalization is performed by modifying the NVSS 
noise model so that all such modes are assigned infinite 
variance, as described in Appendix [X] Thus any signal 
which is constant in ip is completely filtered out in the 
inverse signal-|-noise weighted map g which appears in 
our estimators. Note that treating the marginalization 
as part of the noise model means that the loss in sen- 
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FIG. 8: Maximum likelihood NVSS galaxy power spectrum, 
calculated without (top panel) and with (bottom panel) 
marginalization of m = modes in equatorial coordinates. 
In the bottom panel, fiducial spectra are shown (both for 
bg = 1.7) from the model for dN/dz by (dotted line) and 
our fit in Eq. [21] (dashed line) . 





FIG. 9: NVSS galaxy overdensity field in equatorial coordi- 
nates, low-pass filtered to multipoles £ < 10, showing visible 
azimuthal striping. 



sitivity due to marginalizing m — modes is already 
included in the statistical errors; it is not necessary to 
assign systematic errors separately. 

After including this marginalization in the analysis, 
the NVSS galaxy power spectrum shown in the bottom 
panel of Fig.[5]is obtained, showing reasonable agreement 
with our fiducial C^^ . Marginalizing m — modes pro- 
duces a large shift in the lowest bandpower and a much 
smaller shift in higher bands. In Fig. [TO] (top panel), we 
show the shift in each bandpower when m = modes 
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FIG. 10: Change ACf^ in maximum likelihood galaxy power 
spectrum, when NVSS is analyzed with m = marginaliza- 
tion vs no margnialization (top panel) or m = 0, 1 marginal- 
ization vs m = marginalization (bottom panel) in equatorial 
coordinates. The error bars represent the RMS shift obtained 
when Monte Carlo simulations are analyzed in the same way. 



are marginalized, relative to an error bar which shows 
the RMS shift obtained when the same marginalization 
is performed in NVSS simulations. It is seen that the 
shift is statistically significant not only in the lowest £ 
band, but all the way to £ ~ 100. We conclude that dec- 
lination gradients in NVSS are an important systematic 
on a range of scales and should always be marginalized 
in cosmological studies. 

Has marginalizing m — completely removed the sys- 
tematic? To answer this, we tried marginalizing the 
m — 1 Fourier mode in the azimuthal coordinate if, in ad- 
dition to the m = mode. In this case, we find (Fig. [TOl 
bottom panel) that the shift in Cf^ bandpowers is con- 
sistent with simulations. (There is a possible glitch at 
i ~ 200, but this is outside the range of angular scales 
which contribute to the lensing detection.) Therefore, 
we believe that marginalizing all modes with m = in 
equatorial coordinates completely removes the system- 
atic; there is no evidence that the contamination extends 
to higher m. 

In addition to declination gradients, there is another 
NVSS systematic which has been relevant for cosmolog- 
ical studies: multicomponent sources [tO, [Z1|. Radio 
galaxies whose angular size is sufficiently large to be re- 
solved by the 45-arcsec NVSS beam will appear as mul- 
tiple objects in the NVSS catalog. This can contribute 
extra power to the auto spectrum C|®, at a level which is 
a few percent of the shot noise. At worst, this could in- 
crease the variance of our cross-correlation estimator C^^ 
by a few percent without biasing the estimator. Further- 
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more, as can be seen in Fig. [8] (bottom panel), we see 
no evidence for galaxy power in excess of fiducial in the 
highest i band, which is most sensitive to this system- 
atic. We conclude that multicomponent sources are a 
negligible source of systematic error for CMB lensing. 

Next we consider uncertainties in the NVSS redshift 
distribution dN/dz and galaxy bias hg. These uncertain- 
ties affect our fiducial power spectra Cf^ , Cf^ in a given 
cosmology, and would need to be understood in detail 
if we wanted to constrain cosmological parameters from 
our lensing detection. However, since we are merely mea- 
suring the cross spectrum Cf^^ there is only one effect to 
consider: the Monte Carlo error bars we assign depend on 
the fiducial galaxy spectrum Cf^ used in the simulations. 
(We verified in simulations that the fiducial cross spec- 
trum Cf^ does not significantly affect the error bars.) 
If we use a fiducial Cf^ with too little power, we will 
underestimate our errors. Therefore, it is important to 
check that our fiducial Cf^ agrees with the galaxy power 
spectrum obtained from the data. 

Estimates for the radio luminosity function inspired by 
optical and infrared observations were given in [69|. Us- 
ing their mean-z, model 1 for average sources, [72| were 
able to reproduce the NVSS auto-correlation function 
rather well. However the dotted curve in Fig. [5] shows 
the galaxy power spectrum Cf^ , calculated using a mean 
bias oihg — 1.7 (in agreement with the values in [7ll.[7l|l 
and the same model for dN/dz. For our fiducial value of 
(Tg, the model power spectrum is deficient relative to the 
observed power spectrum. 

Therefore, we search for a NVSS redshift distribution 
that better reproduces our angular power spectrum mea- 
surement. We find that for bg = 1.7, a near Gaussian 
which is lopsided toward low redshift and centered at 
zo = l.f: 



the normalization of matter fluctuations erg or the total 
matter density f2o) from our measurements of the NVSS 
angular power spectrum and the cross correlation Cf^. 
We return to this issue in 




' 2(0.8)^ 
(z-zof 
2(0.3)^ 



{z < Zq) 
[z > Zq) 



(21) 



results in a good fit. This match to the NVSS angular 
power spectrum is shown in the dashed curve of Fig. [S] 
We have used this fiducial Cg^ in all simulations in this 
paper. 

We make no claim that our fiducial (dN/dz) is a more 
accurate model for the real NVSS redshift distribution 
than the previously considered model. It is just a device 
for generating simulations with the same power spectrum 
as the data, so that we do not underestimate our error 
bars. As a check, in Sect ion [TVl we compared Monte Carlo 
based error estimates for WMAP data versus NVSS data 
on one hand, and WMAP data versus NVSS simulations 
on the other, and obtained agreement (Fig.[5|). Using the 
dotted line in Fig. ([5]) would underestimate the power 
spectrum errors by ~ 20% due to the disagreement with 
the power spectrum seen in the data. We have not inves- 
tigated the reason for the disagreement in detail since it 
is somewhat peripheral to the primary purpose of this pa- 
per. However, the redshift distribution and galaxy bias 
assumed in the modeling would be critical if we were 
to infer constraints on cosmological parameters (such as 



VI. WMAP SYSTEMATICS 

Because our lensing estimator receives contributions 
from CMB anisotropics on small angular scales (Fig. [3]), 
the WMAP systematics most likely to affect the detec- 
tion are point sources and beam effects. In our pipeline, 
beam effects are incorporated by convolving the CMB 
with an isotropic beam (Eq. (H^) which is different for 
each DA. This is approximate in two ways: first, the 
real WMAP beams are not perfectly isotropic, but con- 
tain asymmetries which also convolve small-scale modes 
of the CMB by a sky varying kernel defined by the the 
details of the scanning strategy. Second, the isotropic 
part of each beam is not known perfectly; uncertainty in 
the beam transfer function acts as a source of systematic 
error in our lensing detection. We study these two effects 

in fvraifvrB] 

In WI C[ wc consider Galactic microwave foregrounds 
and show that their effect on the lensing detection is 
small. Point sources and thermal SZ will be treated sepa- 
rately in fjvnl fVlIIl The ISW effect H does not affect 
our lensing estimator, since the signal is negligible on 
CMB angular scales {£ ~ 400) which contribute. The 
Rees-Sciama effect [TJ] would give a small contribution 
on these scales, but we will ignore it since it is negligible 
compared to the SZ signal. 



A. Beam asymmetry 

The WMAP beams are asymmetric due to: 1) the 
feeds not being at the primary focus, and 2) substruc- 
ture caused by 0.02 cm rms deformations in the primary 
mirror [49]. The Q-band beams are elliptical with mi- 
nor/major axis ratio of ~ 0.8. The V and W-band beams 
show significant substructure at the —10 to —20 dB level, 
leading to « 0.7% distortions in the inferred power spec- 
trum 46] . 

Although deviations from azimuthal symmetry of the 
beams have a small effect when estimating the WMAP 
temperature power spectrum, it is unclear whether the 
same is true when estimating lensing. At an intu- 
itive level, CMB lens reconstruction recovers degree-scale 
modes of the lensing potential indirectly, through their 
distorting effect on smaller-scale hot and cold spots in 
the CMB. Beam asymmetries which convolve the small- 
scale CMB modes have a qualitatively similar effect and 
may be degenerate with lensing. For example, a beam 
quadrupole imparts an overall ellipticity or shear to the 
hot and cold spots. 

To incorporate beam asymmetry into our pipeline, we 
expand the beam profile in spherical harmonics Yig. The 
s = multipoles of the beam represent the azimuthally 
averaged beam and are already incorporated in both the 





FIG. 11: Result of convolving a single noiseless CMB realiza- 
tion with the WMAP VI beam, including beam asymmetry. 
We have shown the output map separated into contributions 
from different beam multipoles; s — (isotropic component, 
top left), s = 1 (top right), s — 2 (bottom left), and s — 3 
(bottom right). Each map has been scaled independently for 
visibility; the RMS temperature in the s = 0, . . . , 3 maps is 88, 
0.4, 1.0, 0.04 fiK. The convolution with the s > multipoles is 
scan dependent and shows alignments with the ecliptic poles 
reflecting the WMAP scan strategy. 



analysis and simulation directions of our pipeline. The 
higher-s multipoles have been estimated by the WMAP 
team and represent corrections to the azimuthally sym- 
metric approximation. In Appendix [Dl we show how to 
incorporate the higher multipoles into the simulation di- 
rection of the pipeline, generalizing the convolution in 
Eq. (|12|) . In contrast to the s = multipoles, convolving 
with the higher multipoles depends on the scan strat- 
egy; our method incorporates the details of the WMAP 
scan based on full timestream pointing. In Fig. 1111 we 
illustrate our simulation procedure for a single noiseless 
realization in V^-band, showing the contribution of the 
s — 0, . . .3 multipoles to the beam-convolved map. 

It would be very difficult to incorporate asymmetric 
beams into the analysis direction of the pipeline, so our 
approach is to treat beam asymmetry as a source of sys- 
tematic error. We assign each lensing bandpower C^^ a 
systematic error given by the Monte Carlo RMS change 
in the bandpower when the same WMAP -t- NVSS sim- 
ulation is analyzed with and without including beam 
asymmetry in the simulation pipeline. We find that the 
systematic error in each band is small compared to the 
statistical error. The result is shown, as part of a larger 
systematic error budget, in the "Beam asymmetry" col- 
umn of Tab. [Jin SjIXl 



B. Beam uncertainty 

We have shown that systematic errors from beam 
asymmetry are small, so that the beam may be treated 
as the simple convolution in Eq. to a good approx- 
imation. This leaves only one remaining beam-related 
source of systematic error: measurement uncertainty in 
the beam transfer function Bi. 

We model the beam transfer function uncertainty fol- 
lowing [46l . §A.2]. The beam covariance matrix is domi- 



FIG. 12: Foreground templates used in this paper, shown 
with KpO mask ( ^H^ applied. Left panel: Dust template, 
based on 74] with frequency dependence given by Eq. ((23} . 
Right panel: Free-free template, based on [tsI. [7a| with fre- 
quency dependence given by Eq. (|24)| . The masked RMS of 
the templates in V-band is 6.4 fiK and 4.8 /^K respectively. 

nated by a small number of modes. We SVD decompose 
the matrix for each DA and keep only the 10 most signif- 
icant modes. Then we construct realizations of the beam 
transfer function using 




where B^^^ is the standard beam transfer function, Ui 
are unit-variance normal random deviates, and are 
the beam covariance modes. 

Armed with this simulation procedure, we assign sys- 
tematic errors by computing the RMS change in each 
bandpower when the same simulation is analyzed with 
and without simulated beam uncertainty. We find that 
the systematic errors are extremely small. 



C. Galactic foregrounds 

In addition to the CMB, the sky at microwave fre- 
quencies contains other foreground signals which must 
be considered as a source of systematic error in lensing. 
We will find that the most important of these are point 
sources and the thermal Sunyaev-Zeldovich effect, which 
will be discussed in Will and Willi respectively. The 
other relevant microwave foregrounds are Galactic in ori- 
gin: dust, free-free emission, and synchrotron radiation. 
For descriptions of the foreground components, we refer 
the reader to ITal . 

Following [401, we will model dust contamination 
by adding a ternplate derived from "Model 8" from 
Finkbeiner et al [7l|, evaluated at 94 GHz and scaling 
to frequency i/ by: 

where Ta denotes antenna temperature. The dust tem- 
plate is shown in Fig. [T^l left panel. 

When we cross-correlate simulations of WMAP and 
NVSS, we find that including the dust template in the 
WMAP simulation results in a very small change in the 
estimated lensing signal. We take the Monte Carlo RMS 
average of the change in each bandpower when the same 
pair of simulations is analyzed with and without the tem- 
plate as a systematic error estimate, shown in the "Dust" 
column of Tab. [J in fjll 
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One might worry that this way of assigning systematic 
errors, based entirely on simulations, is too optimistic 
because it fails to account for unknown correlations be- 
tween the templates and the real datasets. As a check, we 
obtain consistent results if we cross-correlate an ensem- 
ble of WMAP simulations against the real NVSS data, 
or the real WMAP data (with and without template 
subtraction) against an ensemble of NVSS simulations. 
Finally, when the real WMAP and NVSS datasets are 
cross-correlated with and without template subtraction, 
the change in each bandpower is consistent with our sys- 
tematic error estimates, and no evidence for an overall 
bias is seen. 

We treat free-free emission similarly; in this case we 
use the full-sky Ha map from [75| . with the correction 
for dust extinction from [tI], and frequency dependence: 

where 62 = 6.7 ^K/Rayleigh and Iho denotes the Ha 
intensity. Again we find consistent systematic errors 
in the simulation-simulation, simulation-data, and data- 
data cases described in the previous paragraph. The re- 
sults are shown in the "Free- free" column of Tab.Uin §IX( 
the systematic errors from free-free are slightly higher 
than dust, but still small. 

Finally, we turn to Galactic synchrotron emission. The 
WMAP team has derived synchrotron templates both 
from the Haslam 408 MHz survey [T^, [T^I, and inter- 
nally by differencing the K and Ka band WMAP chan- 
nels 46]. However, both of these templates are intended 
for use at degree scales, and do not have sufhcient res- 
olution to measure the sychrotron signal on the angular 
scales {£ ^ 400) which contribute to our lensing estima- 
tor. Therefore, it would not be meaningful to assign sys- 
tematic errors from synchrotron emission by using either 
of these templates. 

In the absence of a template for synchrotron, the best 
we can do is to make the assumption that the synchrotron 
contamination at £ ~ 400 is comparable to the other 
Galactic foregrounds. In V-band, synchrotron, free-free, 
and dust emission all contaminate the CMB at roughly 
similar levels [76| . In addition, synchrotron and dust 
appear to have similar spatial distributions [46, Fig. 5], 
so the dust template should give us a reasonable estimate 
of possible synchroton contamination. However, a direct 
test of this assumption will have to await future higher- 
resolution measurements of synchrotron emission. 

These results and the consistency of our measurement 
between frequencies (Fig. lead us to conclude that our 
lensing detection is not contaminated by significant resid- 
ual foregrounds. However, we quantify it by assigning 
each lensing bandpower a total systematic error from 
foregrounds by adding the systematic errors from the 
dust and free-free templates (treating the two as corre- 
lated) and then doubling each RMS error to account for a 
synchrotron contribution with the same order of magni- 
tude. The result is shown in the "Total Galactic" column 
of Tab.Uin flXl 



VII. POINT SOURCE CONTAMINATION 

Point sources which are bright enough to be resolved 
by WMAP are excluded by the KpO mask (fjlT|, but unre- 
solved point sources act as a contaminating signal in the 
CMB. If the unresolved CMB point source signal were un- 
correlated to NVSS, we would not expect point sources 
to affect our lensing estimator C^^ significantly. How- 
ever, NVSS radio galaxies will contribute some nonzero 
flux at microwave frequencies and so appear directly as 
part of the point source contribution to the CMB. In ad- 
dition, CMB point sources which do not actually appear 
as objects in NVSS may be correlated to NVSS objects 
in some way, e.g. if both are tracers of the same large- 
scale potential. Therefore, point sources are a possible 
contaminant of our lensing detection. 

In this section, we will place limits on the level of point 
source contamination and assign systematic errors. Point 
sources will turn out to be our dominant source of sys- 
tematic error, and so we will devote considerable effort 
to constructing reliable error estimates. 



A. Point source estimator 

It is difficult if not impossible to construct a realistic 
model which would allow the level of point source con- 
tamination to be reliably estimated from general princi- 
ples. At radio and microwave frequencies, several popula- 
tions of point sources have been identified [H, [7^ HO, lU 
with significant uncertainties in spectral index and clus- 
tering properties. 

Therefore, our approach will be to estimate the level 
of point source contamination directly from the data. In 
this subsection, we will motivate and construct an es- 
timator which is optimized for detecting point sources 
instead of CMB lensing, to use as a monitor for point 
source contamination. The first candidate for the point 
source estimator is simply the cross power spectrum Cj^ . 

However, consider the following toy model for point 
sources: suppose that there are N distinct populations 
of unclustered Poisson point sources which appear as ob- 
jects in the NVSS catalog, and the z-th population has 
number density rii and constant flux per source Si at 
CMB frequencies. In this model, the cross power spec- 
trum is 

N 

CP oc S^n, (25) 
1=1 

whereas the bias to the lensing estimator is proportional 
to 

N 

Adf' oc J2 S^n, (26) 

i=l 

Because the right-hand sides of Eqs. (pS)) are not 

related in any model-independent way, one cannot trans- 
late a value of the cross spectrum Cj^ to an estimate 
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of the point source contamination in the lensing estima- 
tor, without making implicit assumptions about the point 
source model. 

For this reason, we next consider a different candidate 
for the point source estimator: the three-point estimator 
optimized to detect the "Poisson" bispectrum 



constant 



(27) 



where, following Eq. ([5]), ^1,^2 denote CMB multipoles 
and £3 denotes a galaxy multipole. (We will construct the 
estimator shortly; for now we "define" the point source 
estimator by writing down the bispectrum which we want 
to detect.) 

To motivate this form, we note that the bispectrum in 
our toy model is 



N 



(28) 



Comparing to Eq. (|26|) . we see that each point source 
population makes contributions to the Poisson bispec- 
trum and lensing estimator C'^^ which are proportional. 
Therefore, an estimate of the Poisson bispectrum will 
directly translate to a systematic error estimate for the 
lensing estimator. 

This aspect of our toy model illustrates a general 
point: a statistical contaminant, such as unresolved point 
sources, affects the lensing detection by making a contri- 
bution to the bispectrum bg-^i^ta which may be coupled 
to the lensing bispectrum (Eq. ([5])) which is measured 
by our estimator. Therefore, when trying to understand 
point source contamination, one should first ask: what 
bispectrum do point sources contribute? 

We will actually consider a more general point source 
bispectrum than the Poisson form in Eq. (j27p . which 
relaxes two assumptions of the toy model. First, we 
have assumed that point sources do not cluster (i.e., are 
purely Poisson distributed). Furthermore, we have as- 
sumed that each CMB point source appears as an object 
in NVSS; there is a second case to consider in which the 
point sources do not actually appear as objects, but are 
merely clustered in a way which is correlated to NVSS. 

Consider a population of clustered point sources which 
are tracers of a Gaussian field p. (We assume that the 
bias is absorbed into the definition of p, so that the prob- 
ability of a point source at position 2: is oc (1 -f pix)).) 
For our second case, where the point sources do not ap- 
pear as NVSS objects, a short calculation shows that the 
point source bispectrum is: 



(29) 



In the first case, where the sources do appear as NVSS 
objects, the bispectrum is given by: 



N 



N 



r'PP 



N 



{Ci: + CZ) (30) 



where {S) is the average temperature at CMB frequen- 
cies, n is the number density of the point source popula- 
tion, and N is the number density of NVSS. 



In Eq. ((30)) . the first term represents contributions 
from Poisson statistics, the second represents point 
source clustering on the galaxy angular scales (£ ^ 50) 
which contribute to the lensing detection, and the third 
represents clustering on CMB angular scales {I ~ 400). 
We will assume that the third term is small compared 
with the first two and can be neglected. This is a criti- 
cal assumption for our methodology and so we justify it 
carefully, giving two arguments. 

The first argument is that a realistic point source clus- 
tering power spectrum C^'' will be rapidly decreasing 
with £ and so the Cg factors in the third term (with 
I ~ 400) will be small compared with the factor in 
the second term (with £ ~ 50). 

The second argument is more formal and shows that 
the third term in Eq. ([50)1 is small compared to the first 
term. The ratio r of the third and first terms is given by 

< (1.59 X 10^)(2)(2.5 X lO"'^) 
= 0.04 (31) 

In the second line, we have used the fact that the con- 
tribution to the NVSS galaxy power spectrum from 
the point source population alone is given by AC^^ = 
(n/N)C^'' . In the third line, we have used our measure- 
ment of Cf (Fig. [HD, which shows that (Cf < 10'^ 
for £ > 400. The intuition behind this formal argu- 
ment is that if point source clustering were important 
on small angular scales, we would see this signal in the 
NVSS power spectrum. 

We have now shown that the most general point source 
bispectrum is a combination of Eq. (|29p . and Eq. (jSO]) 
with the third term neglected. This motivates our final 
choice of point source estimator: we will use the three- 
point estimator optimized to detect any bispectrum of 
the form 



(32) 



where Fi^ is arbitrary (our estimator will estimate Fi in 
bands). This generalizes the Poisson bispectrum consid- 
ered previously (Eq. (|77|) ). 

We have shown that Eq. ([5^ is a sufficiently general 
form of the point source bispectrum to allow an arbitrary 
clustering power spectrum between point sources, an ar- 
bitrary cross-correlation to the NVSS overdensity field 
g, and applies whether the CMB point sources actually 
appear as objects in NVSS, or are merely correlated to 
NVSS. Indeed, by putting an arbitrary £3 dependence in 
Eq. (j32p . we have been conservative by allowing a very 
general point source contribution. However, there is one 
caveat: we have assumed that point sources are biased 
tracers of Gaussian fields. Non-Gaussian contributions 
from nonlinear evolution have not been included. In halo 
model language [82], we have incorporated one- halo and 
two-halo terms in the bispectrum but not the three-halo 
term. 
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Now that we have determined the most general bis- 
pectrum contributed by point source contamination 
(Eq. (1221)), how do we construct the point source esti- 
mator? In Appendix iBl we show that the optimal esti- 
mator for this bispectrum is constructed in a way which 
is analagous to the lensing estimator C^" (or the curl null 
test d^^). First, we define a field s which is quadratic in 
the CMB: 



semYimix) = a{xy 



(33) 



where a{x) was defined previously in Eq. (|15p . Then we 
cross-correlate s to galaxy counts, subtracting the one- 
point term as usual: 



1 



E 

eeb 
-e<m<e 



(sern - {se.m))*{gim) (34) 



This defines the optimal estimator C^^ for the point 
source bispectrum (Eq. (P^ ). with the galaxy multipole 
^3 binned into a bandpower b. 

Intuitively, the field s can be thought of as a "quadratic 
reconstruction" of CMB point source power, in the same 
sense that is a quadratic reconstruction of the CMB 
lensing potential. Our estimator C^^ is obtained by cross- 
correlating 's to the filtered galaxy field 'g: we are only 
interested in point source power which is correlated to 
NVSS. By using C^^ to directly estimate the bispectrum 
due to point sources from data, we can assign systematic 
errors to the lensing bandpower C^^ which do not depend 
on the details of the point source model, as we will now 
see. 



B. Results 



In Fig. 1131 we show the result of applying the point 
source estimator C^^, constructed in the previous section, 
to the WMAP and NVSS datasets. The to zero is 
11.7 with 12 degrees of freedom. Therefore, no evidence 
for point source contamination is seen. This lets us put 
strong constraints on the systematic error in lensing due 
to point sources: the point source contribution must be 
small enough to be hidden in Fig. 1131 even though the 
estimator C^^ is optimized for point sources. The rest of 
this subsection is devoted to assigning systematic errors 
based on this observation. 

We find that for distinct bands h ^h' , the point source 
and lensing estimators in band h are uncorrelated to the 
estimators in band 6'. This is unsurprising; it follows 
from the definitions that the bands are independent for 
all-sky coverage and homogeneous noise, so that the only 
correlation is due to inhomogeneities. Since we have large 
sky coverage and wide bands, the correlations should be 
small. We will treat each band independently, for con- 
sistency with our point source model, which allows an 
arbitrary i dependence in the point source amplitude 
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FIG. 13: Point source estimator C^® applied to the WMAP 
and NVSS datasets, showing no evidence for CMB point 
source power which is correlated to NVSS. The error bars 
were obtained from Monte Carlo WMAP-I-NVSS simulations 
without point sources. 




10^ 10^ 
flux [mjy] 

FIG. 14: Histogrammed 1.4 GHz flux distribution in NVSS, 
with the fitting function in Eq. (|35p shown for comparison. 



(Eq. (|32|)). We will illustrate our method in detail for 
the band b = (Ci„,Cax) = (20,40). 

First, we use simulations to study the effect of point 
sources on the estimators , C^^ , using the following 
fiducial point source model. (We will show shortly that 
the final result does not depend on the details of the point 
source model.) Each simulated NVSS galaxy is assigned 
a randomly generated flux S'i.4ghz between 2 mJy and 1 
Jy, drawn from the distribution 



s- 



dN 



dS 1 + (S'/200 mJy) 



1.1 



(35) 



This distribution was obtained empirically from the flux 
distribution seen in the real NVSS data (Fig. HH). We 
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FIG. 15: Ensemble of simulations in the fiducial point source 
model (Eqs. (|35p . (I36|l l with varying point source amplitude 
A. For each realization, we show the observed point source 
level d^^ in the band 6 = (^miii,^max) = (20,40) and the 
change in the lensing estimator AC^® due to the point source 
contribution. The dotted vertical line shows the point source 
level in this band estimated from the real WMAP + NVSS 
data; the smaller vertical error bar shows the mean and RMS 
Act' among simulations whose observed point source level 
matches the measured value. 

then assign the flux 

at each WMAP frequency i', where A is a constant which 
will be varied to simulate different overall levels of point 
source contamination. Following J^, we take spectral 
index a = in our fiducial point source model. 

In Fig. [151 we show the values of the point source es- 
timator C^^ obtained in an ensemble of simulations with 
varying point source amplitude A, and the change AC^^ 
in the lensing bandpower which is due to the point source 
contribution. (Note that we do not show the true point 
source amplitude A for each simulation; we show the ob- 
served point source level C^^, estimated the same way as 
in the data.) 

We find that the results can be fit by treating AC^^ 
as a Gaussian variable with mean and variance which 
depend on C^^: 

{Ad^') = -aCf VariAdt') - f3^+l\C','f (37) 

where a = 0.38 /zR-^, (3 = 1.64 x 10"'^, 7 = 0.21 /iK^^. 

Based on this picture, how can we assign systematic 
errors due to point sources? Consider the distribution 
of ACf^ values obtained by considering only realizations 
whose observed point source level C^^ agrees with the 
value (= 1.3 x 10~^ A'K^) observed in the data (indicated 
by the dotted vertical line in Fig. [151) Note that this 



distribution includes realizations with a range of values 
for the true point source amplitude A; we are effectively 
averaging over point source levels allowed by the observed 
value of C^^ (i.e. the posterior distribution). By Eq. (|37p . 
we get a Gaussian distribution with parameters: 

ACf^ = (-0.5 ± 1.7) X 10-^ (38) 

indicated by the vertical error bar in Fig. 1151 

We have now arrived at an distribution (Eq. (|55)) ) for 

the change in AC"^^ which is due to the point source 
contribution. The central value of this distribution is 
nonzero; point source contamination makes a negative 
contribution on average, as can be seen in Fig. [T5l To be 
conservative, we will not shift our estimate for C"^^ in the 
positive direction by the central value (this would allow 
point sources to "help" the lensing detection), but will 
include the shift as part of the systematic error. Thus we 
would quote the systematic error in Cf^ as: ±2.2 x 10~^. 

As we have described it, this procedure appears to de- 
pend on the fiducial point source model (Eqs. (|35)) . ((SB])). 
However, we find that the final systematic error esti- 
mate in each band is relatively robust even under drastic 
changes to the model. We tried the following extreme 
cases: assigning constant flux to each source rather than 
using Eq. ([55|l . taking spectral index a = ±1 in Eq. 
rather than a — Q, and finally simulating point sources 
which are merely correlated to NVSS rather than appear- 
ing as NVSS objects. All of these models give similar 
results to within a factor ~ 2. (Note that our point 
source estimator in Eq. p4p is actually optimized for 
point sources with a blackbody spectral distribution, but 
these results show that we obtain robust systematic error 
constraints across a reasonable range of spectral indices.) 

Repeating this procedure for every € band, we obtain 
a systematic error estimate for each lensing bandpower 
Cf^. Since we have considered several point source mod- 
els, we assign the systematic error for each band using the 
model which gives the largest error in that band. The re- 
sults are shown in the "Resolved point source" column in 
Tab. [Jin ijIXI We find a systematic error which is smaller 
than the statistical error in all bands, but is the largest 
overall source of systematic error. 

The relative robustness of our error estimate to the 
point source model is consistent with our discussion in 
the previous subsection: regardless of the details of the 
model, the contamination to the lensing estimator is pro- 
portional to the level of the point source bispectrum 
(Eq. (|5^ ) contributed by point sources. By directly esti- 
mating the bispectrum, we can obtain a relatively model- 
independent constraint on the systematic error due to 
point sources. This would not be possible if a simpler 
statistic were used, such as the cross power spectrum 

The procedure we have described is similar to the 
Fisher matrix based method that is frequently used to 
marginalize point sources when estimating primordial 
non-Gaussianity from the CMB bispectrum [59] , but dif- 
fers in several details. First, we use a general form of the 
point source bispectrum (Eq. (|32p ) which allows point 
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FIG. 16: Illustration of our procedure (Eq. (|39j) for simu- 
lating correlations between the NVSS galaxy field (left) and 
source mask (right). For visibility, we have bandlimited the 
galaxy field 5g to £ < 6, and used 100 sources with masking 
radius 4° rather than the mask parameters of the datasets 



source clustering, and also allows CMB point sources to 
appear or not appear as NVSS objects. Second, we do 
not shift the lensing estimator by the central value of the 
posterior distribution in Eq. ([55]) . but treat the shift as 
part of the systematic error. Finally, the Fisher matrix 
formalism would not predict the increased variance in 
AC*!^ in the presence of point sources (Eq. ([57)1 ). This is 
included in the Monte Carlo based procedure presented 
here. The Fisher matrix does predict the overall negative 
slope in Fig. 1151 which is a property of the point source 
and CMB lensing bispectra. As a check, if we directly 
compute the Fisher matrix (see Eq. I|43p below), we find 
a weak negative correlation (w —0.1 in each band) be- 
tween the lensing and point source shapes. 



C. Resolved point sources 

Now that we have analyzed systematic errors in lens- 
ing from unresolved CMB sources, we consider resolved 
sources. Resolved CMB point sources have been treated 
in the pipeline by simply masking each source fiflB. If 
the sources are correlated to radio galaxies, so that the 
WMAP mask is correlated to NVSS, one may wonder 
whether the masking procedure can bias the lensing de- 
tection. 

We can prove the following general result (Ap- 
pendix [U|) : in the absence of CMB lensing, correlations 
between the mask and galaxy field cannot fake the lens- 
ing signal, i.e. the expectation value (C^^) is zero even if 
the mask is correlated. Interestingly, our proof depends 
on the presence of the one-point term in the estimator 
(Eq. ([T7)) ') and does not rule out the possibility of bias if 
this term is omitted. 

Given this general result, the lowest-order effect that 
might be expected from mask-galaxy correlations is a bias 
proportional to the lensing signal, i.e. a calibration er- 
ror. We looked for a calibration error in simulations, by 
randomly generating a point source mask by assigning a 
point source to pixel x with probability 

^^^^ otherwise ^"^^^ 

(This is an extreme case, corresponding to a linear bias 
model p{x) oc 1 -I- b{dg{x)) in the maximally biased limit 



b 0.) An example of this simulation procedure is 
shown in Fig. 1161 

With the source mask density of the real datasets (|JI1, 
we see no evidence for a calibration error after 1024 
Monte Carlo simulations of the full pipeline. The same 
result was obtained replacing the NVSS overdensity Sg by 
the lensing potential (j) on the right-hand side of Eq. (I39p . 
or bandlimiting the right-hand side for several choices of 
£ band. 

Since we do not have a general proof that the cali- 
bration error is small, we can only conclude that it is 
smaller than the ~ 3% statistical limit from our Monte 
Carlo sample. In Tab. [H we have assigned each band- 
power a 3% systematic calibration error in the "Resolved 
point sources" column, but we see no evidence for the 
effect and it may be much smaller. 



VIII. SUNYAEV-ZELDOVICH FLUCTUATIONS 

A further source of possible contamination of the 
WMAP-NVSS correlation comes from re-scattering of 
the primordial microwave background off hot electrons 
inside the large scale structure field that also underlies 
the distribution of NVSS sources. The largest effect is 
the thermal Sunyaev-Zel'dovich (SZ) effect 22, 23], due 
to inverse Compton scattering which shifts photons away 
from their originally black-body spectrum. The kinetic 
Sunyaev-Zel'dovich (kSZ) effect, due to Doppler scat- 
tering of CMB photons by large scale structure moving 
along the line of sight, is expected to be a concern for 
lensing reconstruction with future CMB experiments that 
are able to frequency clean the thermal effect |83|, H^l . On 
the angular scales relevant for WMAP, the kinetic effect 
is much smaller and more Gaussian than the thermal ef- 
fect, and we neglect it in the following analysis. 

The induced temperature change of the thermal SZ 
compared to the CMB, AT(n)/TcMB = g{'^)y, is pro- 
portional to the line of sight integral over the cluster gas 
pressure, y = (the Compton-y parameter) , 

where rig is the free electron density, ks the Boltzmann 
constant, Tg the electron temperature, me the electron 
mass, and ctt the Thomson scattering cross section. It 
also has a characteristic frequency dependence, given in 
terms of the dimensionless frequency x = -j^^ by 

g(x)=x4^-4. (40) 
— 1 

This frequency dependence causes ~ 13 (18)% changes in 
the expected amplitude of the SZ between the WMAP V 
(Q) and W channels. These differences are smaller than 
the statistical error of our WMAP-NVSS cross correla- 
tion measurement, making it impossible to distinguish 
the SZ effect from lensing on frequency basis alone. 

We therefore rely on angular separation. Our preferred 
way to describe the SZ effect and assign systematic errors 
would be to use full hydrodynamical simulations of the 
effect (e.g. 85, 86, 87]). Unfortunately these have to date 
only been performed on scales of ~ 100 comoving Mega- 
parsec, allowing modeling of secondary anisotropics on 
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scales of only a few square degrees. Our lensing estima- 
tor on the other hand receives contributions from I > 20, 
requiring simulation on scales substantially larger than 
10 X 10 square degrees. A somewhat less computation- 
ally expensive route would be to establish halo catalogues 
based on perturbation theory schemes (e.g. jH,!!^) that 
are then decorated with semi-analytic gas pressure pro- 
files. Even these procedures are however very costy for 
our purposes of covering 40,000 square degrees on the sky 
at a depth of about 4 comoving Gigaparsec, under the 
necessary requirement of resolving halos down to 10^'^ 
solar masses in order to reliably model SZ fluctuations 
below / = 1000 0. 

As we will argue in this section however, on the scales 
relevant to a lensing detection using WMAP, SZ contam- 
ination can be treated as part of the point source contri- 
bution which has been studied in the previous section. 

To begin with, notice that although at WMAP fre- 
quencies SZ clusters contribute a temperature decrement 
to the CMB, their contribution to the point source esti- 
mator C^® is positive, because the estimator is quadratic 
in the CMB. Therefore our "point source" estimator will 
be able to serve as a monitor for the sum of point source 
and SZ contamination. This is yet another advantage of 
the three-point estimator over the cross spectrum Cj^ 
discussed in ^VU Al because point sources make a posi- 
tive contribution to the cross spectrum but the SZ contri- 
bution is negative, the cross spectrum cannot constrain 
both contaminants. 

Next consider the spatial distribution of SZ. The vast 
majority of the thermal SZ signal stems from collapsed 
regions with a gas density contrast of hundreds of times 
the mean density of the universe (see e.g. ^86]). If clus- 
ter profiles could be approximated as 5-functions, then 
they could be treated as biased tracers of large scale 
structure that is correlated to NVSS galaxies. Since our 
point source model (Eq. ([5^ ) allows clustering and cross- 
correlation to NVSS, this would allow us to treat the SZ 
contribution as part of the point source contribution. 

To quantify the deviation from pointlike profiles, in 
Fig. 1171 we show galaxy cluster profiles in angular mul- 
tipole space, calculated with the universal gas-pressure 
profile model of [9l|, at z^O.l and z=1.0. This rcdshift 
range is chosen to span roughly the range where the SZ 
might be correlated to NVSS sources. It can be seen 
that many of the relevant clusters fall below the angular 
scale {£ ^ 400) where our lensing reconstruction gathers 
most of its information, but some large nearby SZ clus- 
ters have profiles as extended as tens of arcminutes, and 
show some slope at the relevant angular scales. 

To determine whether this slope is important at 
WMAP resolution, we consider the angular power spec- 
trum Cg^ , which is an average over redshift and mass of 
all clusters. In cross correlation with NVSS, this integral 
would be modulated by the source redshift number den- 
sity. Since the NVSS redshift distribution is not very well 
understood, here we apply uniform weight to all objects 
to obtain an estimate for the scale dependence of the 
power spectrum. We calculate the power spectrum in- 
cluding both the Poisson (1-halo) and clustering (2-halo) 
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FIG. 17: The Compton-y profile for three different cluster 
masses at z—1 (thick lines) and z—Q.l (thin lines). The pro- 
files have been normalized to 1 at 1=0 to facilitate comparison. 
According to this panel, at high redshift it may be possible to 
approximate even rare and massive clusters as point sources 
on the scale where our lensing estimator gathers most of its 
information, I ~ 400. 




FIG. 18: The thermal Sunyaev-Zel'dovich angular power 
spectrum contributions (in the Rayleigh- Jeans limit) from 
Poisson and clustering terms. On the scale of most inter- 
est for our lensing reconstruction, I ~ 400, the SZ Poisson 
term dominates by an order of magnitude over the clustering 
part. The angular power spectra were calculated using the 
gas pressure profile model by [ool. loH. 



contributions, following the formalism of [90, 1221 • The re- 
sults are shown (for the low frequency (Rayleigh- Jeans) 
limit in which ?/ = — 2) in Figure fTSl 

It is seen that the SZ power spectrum is not flat at 
£ 400, owing to the contribution of the most massive 
and nearby clusters, but has the rough scaling 



Ci cx £-1-2 
over the relevant range of angular scales. 



(41) 
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We can incorporate this scale dependence into the 
analysis by considering a bispectrum of the form 



6x10-6 



(42) 



To quantify the effect of scale dependence on the lensing 
estimator, we compute the correlation between this shape 
and the point source shape (Eq. ([5^ ). using the Fisher 
matrix formalism [93| . According to this, the Fisher ma- 
trix element between two bispectra b'f^l^^^^b^f^^^^^ is de- 
fined by 



2 ^^^^ iCj^ + Nl^){Cl^ + Nl^){Cr, + Nf^^) 

(43) 

To a good approximation, when bispectra are estimated 
from data, the covariance matrix is given by: 



Cov(5W,6(''))=/-,iF-,i 



(44) 



When we compute the Fisher matrix for the point source 
(Eq. (122])) and scale-dependent (Eq. (HU) shapes at 
WMAP and NVSS noise levels, we find a correlation co- 
efficient ~ 0.95. At this level of correlation, the point 
source shape and SZ shape can not be distinguished to 
1(7, unless a 6(7 detection of the point source shape can 
also be made. Since we do not find any evidence for point 
source contamination in the data (Fig. [T5|) . we conclude 
that the difference between the point source and SZ bis- 
pectra should be negligible in the context of the WMAP 
and NVSS data sets. 

As an additional check, we tried modifying our point 
source simulations by giving each point source an cx 
£-0.6 pj-Qfiie, and SZ frequency dependence (Eq. (HH])), 
including the negative sign. This crude procedure is of 
course not an accurate method for simulating SZ in de- 
tail, but does incorporate two qualitative features which 
distinguish SZ from point sources at WMAP resolution: 
the scale dependence (Eq. I4ip and frequency dependence 
(Eq. HD|) . We find that the systematic errors in lensing 
(obtained from Monte Carlo simulations as described in 
i jVIII) are within the range of point source models previ- 
ously considered, showing that neither of these deviations 
from pure point source behavior significantly affects our 
method. 

Finally, there is one assumption in our point source 
model which we can check explicitly for the case of SZ: 
that clustering is unimportant on scales of I ~ 400 (see 
Eq. [50)1 . This can be seen directly from Fig. [THJ the 
clustering term is dominated by the Poisson term by an 
order of magnitude. 



IX. FINAL RESULT AND DISCUSSION 

In Tab.[T]and Fig. [111 we show our final result: an esti- 
mated value oiCf^ in bandpowers, together with statisti- 
cal and systematic uncertainties. Our procedure for com- 
bining errors is as follows. We combine the errors from 
beam asymmetry ( WI Al) and beam uncertainty f WI B[) 



4x10-8 - 



^ 2x10-8 



- 



n — I — I — I — I — I — I — I — I — I — I — I — I — r 

Fiducial 



— 2x10-8 ' ' ' ' ' ' ' ' ' ' ' ' ' ' '- 

100 200 300 

Multipole 1 



FIG. 19: Final result from Tab. HI showing statistical errors 
alone (blue/inner error bars) and statistical -I- systematic er- 
rors (red/outer). 



into a "total beam" error assuming that the two are com- 
pletely correlated. We obtain a "total Galactic" error 
from Galactic CMB foregrounds by combining the dust 
and free-free systematic errors ( WI C|) assuming corre- 
lated errors, and double the result to account for syn- 
chrotron (where no template is available on the relevant 
angular scales). We obtain a "total point source" error 
by combining the errors from unresolved and resolved 
sources, assuming that the two are correlated. (As we 
have shown in Willi the "point source" errors apply to 
the total systematic error from CMB point sources and 
the thermal SZ effect.) We then obtain our final result 
by combining the statistical, total beam, total Galactic, 
and point source errors, assuming that the four are un- 
correlated. 

What is the total statistical significance of our detec- 
tion? To assess this, we combine our bandpower esti- 
mates into a single estimator C, giving each bandpower a 
weight proportional to its fiducial expectation value C^g^ 
(not the measured value in Tab. |T| and inversely propor- 
tional to its total (statistical + systematic) variance: 



Efc (gfc'L/Var(g^)) Ct 
E.(<L)VVar(a,*^) 



(45) 



wiiere the denominator has been included to normalize 
(C) = 1 in the fiducial model. We find C = 1.15 ± 0.34, 
i.e. a 3.4cr detection. 

Throughout this paper, we have assumed a fiducial 
cosmology, NVSS redshift distribution, and galaxy bias 
when computing statistical errors by Monte Carlo simu- 
lation, and when constructing the {S+N)'"^ filters in the 
analysis pipeline. To what extent do our results depend 
on the fiducial model? Our Cf^ bandpowers and error 
bars depend only on the fiducial power spectra Cj"^, Cf^ 
used in Monte Carlo simulations, not on the details of 
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Beam 






Galactic 




Point 


source ± SZ 




(■^min ; ^max ) 


Statistical 


Asymmetry Uncertainty Total 


Dust Free-free Total 


Unresolved Resolved 


Total 


Stat ± systematic 


(2,20) 


17.4 ±22.4 


±0.9 


±0.3 


±1.2 


±0.4 


±1.4 


±3.6 


±10.9 


±0.5 


±11.4 


17.4 ±27.4 


(20,40) 


33.2 ±10.5 


±0.2 


±0.1 


±0.3 


±0.2 


±0.5 


±1.4 


±4.9 


±1.0 


±5.9 


33.2 ± 13.0 


(40, 60) 


15.9 ±7.8 


±0.1 


±0.1 


±0.2 


±0.2 


±0.3 


±1.0 


±2.8 


±1.5 


±4.3 


15.9 ±9.3 


(60, 80) 


10.1 ±6.3 


±0.1 


±0.1 


±0.2 


±0.1 


±0.3 


±0.8 


±2.0 


±0.3 


±2.3 


10.1 ±7.0 


(80, 100) 


5.1 ±5.8 


±0.1 


±0.1 


±0.2 


±0.1 


±0.3 


±0.8 


±1.1 


±0.2 


±1.3 


5.1 ±6.0 


(100, 130) 


8.3 ±4.3 


±0.1 


< 0.1 


±0.2 


±0.1 


±0.2 


±0.6 


±0.6 


±0.2 


±0.8 


8.3 ±4.4 


(130,200) 


1.6 ± 2.5 


< 0.1 


< 0.1 


±0.1 


±0.1 


±0.1 


±0.4 


±0.3 


±0.1 


±0.4 


1.6 ±2.6 


(200, 300) 


-1.9 ± 2.2 


< 0.1 


< 0.1 


±0.1 


±0.1 


±0.1 


±0.4 


±0.3 


±0.1 


±0.4 


-1.9 ±2.3 



TABLE I: Final estimated C^® bandpowers, together with statistical uncertainties and systematic errors from point sources. 
All entries in the table are l^Cp in multiples of 10"^ 



the modeling. We have checked these fiducial spectra 
in two ways: first, by direct comparison with the mea- 
sured NVSS power spectrum (Fig. [5]); we have omitted 
the comparison for the WMAP power spectrum since our 
fiducial cosmology is the WMAP±ALL cosmology from 
0. Second, we have shown that consistent statistical 
errors are obtained by cross-correlating simulations with 
data (Fig. [5|). The fiducial model is also used to con- 
struct the {S + N)~^ filtering operation, but in this case 
using incorrect power spectra merely makes our estima- 
tor slightly suboptimal and does not significantly affect 
the detection. 

The statement that our result only depends on the 
fiducial spectra Cj"^, C^^, not on the details of the model, 
would not be true if we were attempting to translate our 
measurement of Cf^ into a constraint on cosmological pa- 
rameters. There are several obstacles to doing so which 
we plan to address in future work. First, Cf^ depends on 
cosmology but is also proportional to the NVSS galaxy 
bias hg, which must be marginalized. One possible ap- 
proach is to only consider quantities such as 

Ct'l\[cf (46) 

which should be independent of galaxy bias (ignor- 
ing subleties like redshift-dependent bias). Second, the 
NVSS redshift distribution dN j dz is uncertain and must 
also be marginalized over some reasonable range. We 
note that the auto power spectrum C^^, which appears 
in Eq. is more sensitive to changes in dN/dz than 

the cross spectrum Cf^. A conservative approach to 
marginalizing over cosmological parameters as well as 
redshift and bias uncertainties would be the Markov 
chain Monte-Carlo (MCMC) method (compare ^) ap- 
plied to both Cf^ and Cf^ constraints. 

Finally, we have not considered the impact of magni- 
fication bias: the observed NVSS galaxy field is altered 
by the magnifying and demagnifying effect of gravita- 
tional lenses between the source galaxies and observer 
[95I . [11]. One can think of this as adding terms to the 
galaxy field g{ii) which depend on the matter distribu- 
tion at intermediate redshifts along the line of sight. This 



introduces additional terms in Cg which are not in- 
cluded in our fiducial spectrum, and have been shown 
to be significant when deducing cosmological constraints 
from ISW measurements [93] • In a magnified region, the 
galaxy surface density g{n) receives a negative contribu- 
tion (since magnification spreads a fixed number count 
over a larger area) and a postive contribution (since mag- 
nification brings new galaxies above the flux threshhold 
of the survey), so the effect can have either sign. Note 
that magnification bias affects the fiducial Cf^ in a given 
cosmology, but docs not affect our measured values of 
Cf^ or the statistical significance of the detection. 

We have constructed an estimator for the lensing 
cross-correlation Cf^ which is probably optimal (Appen- 
dices |B|). The estimator is defined in three steps. 
First, we filter the WMAP and NVSS datasets by their 
inverse signal + noise covariance, thus "distilling" the 
datasets to harmonic-space maps a{m,9tm- Second, we 
perform lens reconstruction on the filtered WMAP data 
o-e-nn producing a noisy reconstruction of the CMB 
lensing potential which is quadratic in the data. Third, 
we cross-correlate (p and g, subtracting the one-point 
term. 

Subtracting the one-point term is necessary to make 
the estimator optimal, and also eliminates systematic 
bias from resolved point sources (jVII Cp , although a sys- 
tematic calibration error may remain. Since the one- 
point subtraction is trivial to implement in a Monte Carlo 
pipeline, we recommend that it always be used. The 
other feature making our estimator optimal is full-blown 
{S + N)~^ filtering (Appendix [A]) . Here, it is unclear 
whether the optimal filter is practically necessary; it may 
be possible to construct a simpler filter which approxi- 
mates {S + N)~^ and produces near-optimal estimates in 
practice. In any case, an optimal implementation is an 
invaluable tool when studying candidates for such a filter, 
since the results can be directly compared to optimal. 

We have studied potential sources of systematic er- 
ror from known NVSS systematics ( ifV)) . WMAP beam 
effects (fjVT^fjVIB]) Galatic microwave foregrounds 
(i jVI CI) , point sources ( WIip . and the thermal Sunyaev- 
Zeldovich effect ( WIIip . Error estimates from each of 
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these systematics have been included in our final result. 

The most problematic systematic for CMB lensing, at 
least when measured in cross-correlation to large-scale 
structure, seems to be point source contamination. In 
general, a statistical contaminant such as point sources 
affects the lensing detection by contributing some bis- 
pectrum bi-^i^i^ which may be correlated to the lens- 
ing bispectrum which our estimator measures (Eq. ([5])). 
We therefore treat point sources by directly estimating 
the point source bispectrum from the data, to moni- 
tor the level of contamination and assign systematic er- 
rors. We allow a form of the point source bispectrum 
(Eq. ([5^ ) which is sufficiently general to include a wide 
range of point source models, including clustered sources 
and sources which may or may not appear as objects in 
NVSS. 

We have argued that at WMAP sensitivity levels, 
thermal Sunyaev-Zel'dovich fluctuations due to hot gas 
in clusters of galaxies can be treated as part of the 
point source contribution. We checked that the level of 
scale dependence in the bispectrum, introduced by large 
nearby objects, is unimportant at WMAP resolution, but 
we do not expect this to be the case for smaller scale ex- 
periments such as Planck [lOl], ACT [H, or SPT [H, 
which will begin to observe the sky in the near future. In 
fact, even the qualitative trends we have found in Tab. U 
for systematic error contributions may be different for 
these future surveys, which will probe new regimes of 
sensitivity and resolution. The detection from WMAP 
that has been presented here is a milestone toward de- 
tailed measurements of CMB lensing that lie ahead. 
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APPENDIX A: FAST {S + N)-^ FILTERING 

In this appendix, we present the details of our method 
for computing the inverse signal -I- noise weighted map 
a= {S + N)-^a, for either the WMAP or NVSS data. 



Outside the context of lens reconstuction, this inver- 
sion problem also arises for other types of optimal analy- 
sis in which the data is weighted by inverse signal -I- noise, 
e.g. optimal power spectrum estimation f6S|, power spec- 
trum analysis by Gibbs sampling [66, 67], and bispectrum 
estimation [6l| . We expect that our method will be useful 
in these contexts as well. 



1. Conjugate gradient inversion 

First, let us introduce some notation. We assume a 
dataset which is specified by A^chan pixel-space maps, 
with a common underlying harmonic-space signal S£,„. 
Thus we can write 

df"" = A,s + (noise) (Al) 

where Ai is the pointing matrix associated to the z-th 
channel. 

This generality is sufficient to describe both the 
WMAP and NVSS datasets. For WMAP, we have 
-^chan — 8 corresponding to the eight Q, V, and W-band 
differencing assemblies used in the analysis, the signal 
Sim is the noiseless CMB, and each pointing matrix Ai 
includes convolution with the pixel window function and 
beam of the corresponding DA. Our convention is that 
the signal s is defined in harmonic space, while the data 
d^"^ is defined in pixel space. Thus the operator Ai in 
Eq. (|Aip is defined by applying beam and pixel window 
functions to the harmonic-space signal (see Eq. ([72]) ). 
then taking the spherical transform to produce a map in 
pixel space. For NVSS, we have iVchan = 1 corresponding 
to a single galaxy count map, with no beam convolution 
included in the pointing matrix A, since the 45-arcsec 
NVSS beam can be neglected on angular scales {£ < 250) 
which contribute to the lensing estimator. 

In [9^, it is shown that the data in Eq. (|Aip can be 
reduced to a single harmonic-space map a, with associ- 
ated noise covariance matrix N, without losing informa- 
tion. The map a and matrix TV are defined by the pair 
of equations 

iV-i = Y^AjiNrr'Ai (A2) 

i 

N-'a - ^Af(iVf'^)-idf" (A3) 

i 

where iVf is the noise covariance associated to the z-th 
map. 

Let us first assume a noise model (which we will gen- 
eralize in 3p such that the inverse noise covariance 
^jypix^-i ^j^g map is diagonal in pixel space. For 
WMAP, this is the noise model used to analyze the tem- 
perature power spectrum [11]; for NVSS, the diagonal 
noise covariance represents shot noise and is constant be- 
tween pixels. (In both cases, a sky cut is incorporated by 
setting to zero inside the mask.) In this noise model, 
it is trivial to compute N~^a using Eq. (|A3p . but what 
we need in our analysis pipeline is a = (S + N)^^a. Note 
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that is generally not invertible due to the presence 
of unconstrained modes (such as pixels excluded by the 
sky cut), so that a is not determined by Eq. (jA3|) . but 
the data do determine N~^a, and having this is sufficient 
for a. The remainder of this appendix is devoted to an 
algorithm for computing 

Following [67| , we will find it convenient to replace the 
matrix {S + N)~^ by the matrix X^^, where 

X l + S'/^N-'S'^' 

= l + ^5i/2Af(7Vr)-M,5i/2, (A4) 



Using the identity {S + N)-^a = S-^/^X~^S^/^N-^a, 
it suffices to give an algorithm for multiplying a map by 
X^^. Since the number of degrees of freedom is too large 
for direct matrix inversion, this multiplication must be 
performed using conjugate gradient inversion [99| . Per- 
formance of the conjugate gradient method depends on 
a good choice of preconditioner, or linear operator which 
is efficient to compute and approximates X~^. 

A common way to construct a preconditioner is to re- 
place X by some simpler approximation X' which can be 
inverted exactly, and use {X')^^ as the preconditioner. 
The simplest preconditioner of this type would be X'^^, 
where X^ is the matrix defined by keeping only the di- 
agonal of X. 

With this diagonal preconditioner, we have found that 
the conjugate gradient search will eventually converge, 
but the convergence is extremely slow. To understand 
why it is slow, note that X^^ will only be a good ap- 
proximation to X~^ when X is diagonally dominated. 
This will be the case on angular scales which are noise- 
dominated {S/N <^ 1), since X will be close to the iden- 
tity matrix, but on large angular scales where the signal 
dominates, the preconditioner is not a good approxima- 
tion to X~-^, and the convergence rate becomes limited 
by these scales. 

This picture motivates the following improved precon- 
ditioner, w hich has been used in several previous treat- 
ments [3 |100| . Define the matrix Xq by keeping all 
matrix entries in the dense block corresponding to mul- 
tipoles {£,m) satisfying £ < fgpiit- Then consider the 
preconditioner 



and distinct beam transfer functions for each of the eight 
differencing assemblies in Q, V, and W-band. 

The slow convergence of this preconditioner is a bot- 
tleneck for our lens reconstruction analysis and has also 
been identified as a limiting factor in other contexts, e.g. 
Gibbs sampling [63|- Therefore, a faster method is 
desirable. 



2. Multigrid preconditioner 

So far, we have recalled existing work in the litera- 
ture: fast {S + N)~^ filtering can be performed via con- 
jugate gradient inversion with the block preconditioner 
(Eq. (|A5[) ). In this section, we present our improvement. 
The idea is that, even with the block preconditioner to 
do the inversion exactly at multipoles below ^spiit, conju- 
gate gradient inversion is still limited by the convergence 
rate at multipoles just above ^spiit (since the lowest mul- 
tipoles will have highest signal-to- noise) . However, these 
are precisely the multipoles which can be represented in 
a coarser pixelization. 

This leads naturally to a multigrid preconditioner: one 
preconditions the inversion at resolution iVgidc using the 
result of performing the inversion at coarser resolution 
A'sidc/Z, where the spherical transform is faster by a fac- 
tor of ^ 8. This process is recursive; the inversion at 
resolution iVsidc/2 is preconditioned by an inversion at 
resolution iVsido/4, and so on. At the coarsest resolution 
(typically A^sidc = 128), the inversion is preconditioned 
using the block preconditioner. For the WMAP exam- 
ple with parameters as described at the end of we 
find a running time of 14 CPU-minutes using the multi- 
grid preconditioner. This represents an improvement by 
a factor ^ 15, relative to the block preconditioner alone. 

In detail, the multigrid method works as follows. As 
described in the preceding section, we wish to compute 
X~^a, where a = aim is defined in harmonic space 
up to some maximum multipole £max, and X is de- 
fined by Eq. (jA4p . Then let ATjij be the matrix defined 
analagously, with all noise covariance matrices "coarsi- 
fied" (i.e. with A'sidc decreased by a factor of two), and 



with the maximum multipole reduced to some £, 



(1) 



< 



Then the multigrid preconditioner is defined by 







x; 



(A5) 



X 



obtained by keeping dense matrix entries below ^sput and 
the diagonal above ^piu. (In practice, the choice of ^put 
is usually dictated by memory limitations, since C'(^spiit) 
storage is needed to store Xq in dense form.) In this sec- 
tion, we will refer to (jASp as the "block preconditioner" . 

We have found that the block preconditioner is very 
efficient for the NVSS dataset, but slow to converge for 
WMAP. If we terminate the CG search as soon as we find 
an approximate solution a' ~ X~^a such that the termi- 
nation criterion \a — Xa'\/\a\ < 10^^ is satisfied, then 
block preconditioning requires 3.5 CPU-hours to con- 
verge for the three-year WMAP dataset with KpO mask. 



(1) " 

x; 



(A6) 



i.e. we use the diagonal preconditioner for multipoles 
above ^max- Since applying the preconditioner involves a 
multiplication by ^(i^j and the matrix X(^i-^ is too large 

for dense inversion, we do the X^^^^ multiplication recur- 
sively, using an "inner" instance of conjugate gradient 
inversion. The preconditioner for the inner CG inversion 
is obtained analagously by a second round of coarsify- 
ing noise covariance matrices and reducing the maximum 

multipole to some £max < ^mix, and so on. At the coars- 
est resolution, we use the block preconditioner described 
in the preceding subsection. 
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FIG. 20: Preconditioner chain for multigrid [S + N)~^ filtering, using noise maps from the three-year WMAP dataset. From 
left to right, each set of maps represents one conjugate gradient inversion problem, which is preconditioned by the "faster and 
cruder" approximation which appears next in the chain, obtained by either reducing resolution or the number of distinct beams 
retained in the problem. 



In Figure I20[ we show the preconditioner chain for 
WMAP. The parameters e(i) control the termi- 

nation criterion for each CG instance; when evaluating 
X7.^,, with preconditioner X7.}, we terminate the CG 

(4-1) (j) ; 

search after A'^(j) iterations, or when the approximate so- 
lution a' w X~^a satisfies \a — Xa'\/\a\ < €(^iy We have 
found that it is necessary to include these parameters to 
avoid spending too much CPU time in the coarse grids. 
In the WMAP3 example, the first level of preconditioning 
actually does not reduce the resolution, but instead re- 
duces the number of distinct beams in the problem from 
eight to two (by making the so-called "equal-beam ap- 
proximation" in which the average of the beam transfer 
functions is used). Note that the final output of the inver- 
sion does not make the equal-beam approximation, but 
merely uses inversions with the equal-beam approxima- 
tion internally, to precondition the top-level CG inversion 
where no such approximation is made. 

It is illuminating to describe the sequence of coarsi- 
fying and decoarsifying operations which occur in the 
multigrid method. Each iteration of the top-level CG 
loop requires one evaluation of its preconditioner, which 
in turn is a full-blown CG search (at coarser resolution) 
which can iterate up to 7V(i) times. Each of these iter- 
ations can iterate at the next coarsest resolution up to 
A'^(2) times, and so on. In the parlance of multigrid algo- 
rithms, this exponential fanout is referred to as a W-cycle 
(Figure [nj) . Note that, even though the number of it- 
erations spent at each resolution increases exponentially, 
the total CPU time does not, because the running time 
of each iteration is exponentially supressed; in each level, 
the resolution and value of £niax are typically reduced 
by a factor of two, which reduces the cost of the spheri- 
cal harmonic transform by a factor of eight. Indeed, the 
strength of the multigrid method is that it spends an ex- 




FIG. 21; Sequence of coarsifying and decoarsifying opera- 
tions in an instance of the multigrid method with A^(i) = 3, 
A''(2) = 2, showing the W-cycle structure. Each solid cir- 
cle represents one "forward" operation of the operator X — 
(1 + S^^^ S^^^) at the appropriate resolution. 



ponentially large number of CG iterations on the large 
angular scales, which are slowest to converge but accu- 
rately approximated at coarse resolution, while avoiding 
a large increase in CPU time. 

The performance of the multigrid preconditioner (~ 14 
CPU-min per Monte Carlo WMAP simulation) is suffi- 
cient for purposes of this paper. However, we have also 
found that none of the preconditioners described so far 
give reasonable performance with a realistic sky cut and 
the noise levels and resolution expected for the Planck 
satellite mission. Therefore, the multigrid preconditioner 
is probably not the final word on this subject; additional 
improvements are still needed for future datasets. 
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3. Template marginalization 

So far, we have assumed a noise covariance N^^^ for 
each map which is diagonal in pixel space. Suppose that, 
in addition, one wants to marginalize the amplitudes of 
-^tmpi modes in the map. We have seen several examples 
in the paper: 

1. In both WMAP and NVSS, we marginalize the 
monopole and dipole (iVtmpi — 4). 

2. In NVSS, we remove systematic declination gradi- 
ents by marginalizing any mode which is constant 
around each isolatitude ring in equatorial coordi- 
nates (^lY}. This leads to iVtmpi = -^ring, where 
Airing is the number of isolatitude rings in the pix- 
elization. 

3. In WMAP, one could use this formalism to 
marginalize any signal proportional to external 
foreground templates, although we have not im- 
plemented this because the effect of Galactic fore- 
grounds is small ( WI C[) . 

Template marginalization, in this general form, is easy 
to incorporate in our conjugate gradient framework. Let 
T be an A'tmpi-by-A'^pix matrix containing the templates. 
By the Woodburry formula, template marginalization 
modifies the map covariance as follows: 

(ATP'") - 1 - ( A^P'") - [TiNf'"") - 1 r^] - V(ArP'") - 1 ( A7) 

Since the conjugate gradient method only requires a 
"black box" procedure for multiplying a map by the in- 
verse covariance (A'|'"^)~^, one simply includes the extra 
term in Eq. (K7\ . 

If A^tmpi is small (e.g. in the case of marginalizing the 
monopole and dipole), one can simply keep the matrix r 
in dense form. In cases where Atmpi is large, all that is 
needed is a procedure for multiplying a map by the ma- 
trix T, i.e. computing each template amplitude given a 
map. For example, when marginalizing declination gra- 
dients in NVSS, we implement "multiplication by r" by 
simply averaging pixel values around each isolatitude ring 
in the input map. 



APPENDIX B: THREE-POINT ESTIMATORS 

In Appendix [X] we have described in detail how the 
filtered CMB map aim and filtered galaxy map Ijim are 
computed in our pipeline. In order to completely de- 
scribe our implementation, there is one remaining loose 
end: in this appendix, we will give the details of how our 
quadratic reconstructions (f>ermtpem,^em are computed. 
We will also prove the statement, made throughout the 
paper, that our bandpower estimators C^^ ,Cf^ ,C^^ for 
lensing, curl null test, and point sources are optimal. Our 
proof will depend on the assumption of small deviations 
from Gaussianity, and we discuss the conditions under 
which this assumption applies. 



1. Quadratic reconstruction 

Here, we give the implementational details of how the 

quadratic reconstuctions (f>em,'4'eTn,sem are computed in 
our pipeline. There is a small subtlety because the recon- 
structions are defined by position space equations, e.g. 
(f)em is defined by: 

J2 ^imYirnix) = V"" {a{x)V a^ix)) (Bl) 

em 

but the maps aem,4'em are defined in harmonic 
space. (The quantities a{x),P{x) were defined in 
Eqs. (USD, (HH).)^ 

In principle, (j) can be evaluated as a brute force har- 
monic space sum: 

(flm = X! feiU^Cj^ '3mmlm2°'eimiai2m.2 (B2) 

where ji^i^e^ '^^s defined previously in Eq. ([7]), and we 
have introduced the notation 

^£,£.£3 drf ^/ (24 + l)(2^2 + l)(24 + l) 

\ / Y mi 7712 "13 / 

However, the harmonic-space sum has computational 
cost O(^max) ^^'^ SO we introducc an optimized position- 
space method. 

Multiplying Eq. (jBip on both sides by Yim{x)* and 
integrating over x, one obtains: 

4>im = j d^xV''{Y,^{x))*a{x)VaP{x) (B4) 

The integral can be done exactly using Gauss-Legendre 
quadrature in cos{9) and uniform quadrature in Lp. We 
evaluate the quantities a(x),V al3{x) on the isolatitude 
rings by using a fast spin-0 and spin-1 spherical transform 
respectively. The right-hand side of Eq. (|B4[) can then be 
evaluated using a fast spin-1 transform. This algorithm 
provides an exact evaluation of Eq. (jB4p with compu- 
tational cost 0{l'^^y^). We use an analagous method to 
evaluate the quadratic quantities 'ipem,'sem- 



2. Equivalence with the bispectrum 

As a preliminary step toward proving optimality, we 
show how the estimators C^^ , Cf^ , C^^ can be rewritten 
purely in terms of the associated bispectra. Through- 
out this paper, when we write a bispectrum bi-^i^i^, it is 
understood that £i,£2 are CMB multipoles and £3 is a 
galaxy multipole. 

We write the lensing estimator in the following form: 

^"^'^J^T. ^t' - {^Im)] *9e,n (B5) 
em 
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In Eq. (|B5p and throughout this appendix, Cf^ denotes 
the cross power spectrum we are interested in estimating 
(typically proportional to \/P over some band in £), not 
the fiducial spectrum. 

If we replace by the right-hand side of Eq. (jB2p . 
we obtain: 



_ -'- \ ' f r^TT r^(t>g r/i^ 



2«3 



(B6) 



X [a^imia£2m2 ~ (fl^imi1^2m2)]5'^3 



We can replace {ai^miae^mi) by C^^^^ j^^^^, where in this 
appendix we use the notation {C^)~^, (C^)~^ to distin- 
guish the inverse signal -I- noise covariances for the CMB 
and galaxy fields. Now comparing with the form of the 
bispectrum due to lensing (Eq. ([5])), this becomes: 



2t;3 

m2m3 



(B7) 



X \at.tmtat.2'm.2 — ^Iiml, 



£2m2Ji'*3™3 



We have now written the lensing estimator purely in 
terms of the lensing bispectrum h^-^^^^^ . A similar cal- 
culation shows that the^same is true for the curl and 
point source estimators C^^ , C^f : in both cases the esti- 
mator takes the form in Eq. (|B7[) . with bi^^i^i^ replaced 
by the bispectrum due to lensing by a curl component, 
or the point source bispectrum in Eq. ([5^ . This allows 
us to give a uniform proof of optimality which applies to 
all three cases, as we will now see. 



3. Optimality 

We will now prove the following general statement: for 
any bispectrum bi-^i^i^, the optimal estimator is given by 



(B8) 



where the three-point and one-point terms are defined by 

timi 

2 51 ^^i^2^3^"H™2m3CJim!j2m2ff^«3'n3(B10) 

and F is the Fisher matrix element 



dcf 1 X > , , /^^1^2^3 /^£4£5^6 

= 2 2^ 0<?l<?2<!3 0<?4<?5<?6t^mim2m3y^ 



T-l /T^11^ 



(This expression generalizes the Fisher matrix for all sky 
isotropic noise previously considered in Eq. (|43p to an 
arbitrary noise covariance.) Note that we have computed 
the normalization explicitly; a short calculation shows 



that the estimator in Eq. (|B8|) has unit response to the 
bispectrum bi-^e^isj that the estimator is normalized 
and does not need the 1/J\f prefactor. 

The proof will depend on the assumption of weak non- 
Gaussianity; specifically we will assume that the fields are 
sufficiently close to Gaussian that the estimator variance 
can be approximated by its Gaussian contribution. 

First, we can show using the Cramer-Rao inequality 
that any unbiased estimator £ has variance > 1/F, where 
F is the Fisher matrix i n Eg . (|B11[) . This is proved us- 
ing the method of [63, llOlj . expanding the likelihood 
function for aim, gem around its Gaussian limit using the 
Edgeworth expansion. 

Now consider the variance Var(C'). We are assum- 
ing that this variance may be calculated using Gaussian 
statistics, so that Wick's theorem gives: 

Var(C3,C3) = F + fiC^r'f (B12) 
Cov(C3,Ci) - Cov(Ci,Ci) =/^(CS)-V 

where we have defined 

/fm = 2 X! ^''i''2^^mfm2mC^ml,£2m2 (^1^) 
lirrii 

Putting this together, we get Var(C) — l/F, i.e. the 
Cramer-Rao inequality is saturated. This completes our 
proof that the estimator is optimal, under the assumption 
of weak non-Gaussianity. 

When is this assumption satisfied for lensing? Roughly 
speaking, weak non-Gaussianity starts to break down 
when the instrumental sensitivity becomes good enough 
that a high signal-to-noise detection of CMB lensing can 
be achieved. More precisely, consider the case of full sky 
coverage and isotropic noise. This allows us to make con- 
tact with the results of [56| , where an unbiased estimator 
(j)im is defined for each multipole of the lensing potential, 
with full-sky noise power spectrum Nf'^ given previously 
in Eq. In this notation, one can show that our filtered 

field (pim is equal to {Nf'^)~^(l)i„i, and our estimator is 
given by: 




4g I Vtrn 



ge-n 



(B14) 



The first improvement that can be made to this estimator 
is to make the replacement 




(B15) 



to incorporate the nonzero sample variance of the lenses. 
Our estimator C is optimized assuming Gaussian covari- 
ance among modes of the CMB, and does not "know" 
that there is extra sample variance hidden in the prob- 
lem. However, it is unclear how to generalize C to the 
case of sky cuts and inhomogeneous noise, as we have 
done for C, allowing an arbitrary noise covariance matrix 
TV. 
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The estimators C, C agree when Cj <C A'/"*', i.e. when 
the reconstruction noise in the lensing potential domi- 
nates the signal, considered one mode of the potential at 
a time. This condition holds for WMAP, as can be seen 
from the direct comparison in Fig. [21 left panel. However, 
the estimator C which we have constructed would start to 
become suboptimal for future surveys with sufficient sen- 
sitivity to reconstruct the lensing potential with signal- 
to-noise ~ 1 per mode. For even more futuristic sensitiv- 
ity levels, even the improved estimator C would become 
suboptimal; the three-point estimator could be improved 
by using a maximum likelihood formalism which incorpo- 
rates information from higher-point correlation functions 
of aU orders [STj . 

In addition to these optimality issues for future sur- 
veys, there are other ways in which our estimator might 
be extended. First, we have not considered CMB po- 
larization, which is ultimately expected t o pro vide more 
sensitivity to lensing than temperature [102l |. Second, 
by using full-blown filtering, we have ensured opti- 
mality of the estimator, but it would be interesting to 
determine whether a simpler filter can be found which 
achieves near-optimal power spectrum uncertainties. As 
we have remarked in Appendix \^ the C^^ operation 
seems prohibitively expensive for Planck with existing 
preconditioners, so finding such a filter may be a practi- 
cal necessity for future experiments. 



the expectation value {■)t,g.m on both sides, we obtain: 



ct 



T.,G,M 



{<l>{T,M)}„,)Jg{G,M),rn), 



-{4>{T\M)*J)^,{g{G,M),^)^ 







/ M 

(C3) 



In the first line, we have used the fact that in the absence 
of lensing, the CMB realization T is independent of the 
galaxy realization G once the mask M has been speci- 
fied, to bring the expectation value (•)t inside the sum. 
This completes the proof that the expectation value in 
Eq. (|C1|) vanishes in the absence of CMB lensing, i.e. 
mask-galaxy correlations cannot fake the lensing signal. 

It is interesting to note that this proof would break 
down if the one-point term were omitted from the lensing 
estimator Ct^ . In this case, we would obtain 



b IT,g,M 



'MT,M)Ut{9{G,M),„,)c 



/ M 

which cannot be simplified further: the map {g{G, M))q 
can be nonzero if there are mask-galaxy correlations, and 

the map (4){T, M))t is generally nonzero in the presence 
of a mask. 



APPENDIX C: RESOLVED POINT SOURCES 



APPENDIX D: BEAM ASYMMETRY 



We give a proof of a statement made in WII CI cor- 
relations between the mask and the galaxy field cannot 
fake the lensing signal, i.e. the expectation value 



T,G,M 



(CI) 



in the absence of CMB lensing. We have introduced the 
notation {•)t,g,m to denote an expectation value taken 
over realizations of the CMB T, galaxy counts G, and 
mask M (where the last two are assumed correlated). 

In the proof, we will denote the quadratic reconstruc- 
tion which appears in the lensing estimator by (/)(r, M) 
to emphasize that it depends on both the CMB realiza- 
tion T and the mask M. We will analagously denote the 
filtered galaxy field by g{G,M). In this notation, the 
lensing estimator can be written: 



^t' = E - (-^(^^ * 9{G, M),„ (C2) 

^ — ' L J ini 



where we have written the one-point term as an average 
over CMB realization T' with the mask M fixed. Taking 



To include beam asymmetry in our simulation pipeline, 
we need an expression for the beam-convolved tempera- 
ture T(x) in each pixel x, in terms of three quantities: 
the beam profile, the scan strategy, and the unconvolved 
CMB T{x). 

We represent the beam profile in real space as G{9, (p) 
or in harmonic space as: 



Gie,^) = Y,9esYtsie,ip) 



(Dl) 



Is 



Following f46l. Appendix B], the scan strategy will be 
represented by the following quantity: 



w{x, a) = 27r 



E,g^ - «») 



(D2) 



where the angle a parameterizes beam orientations at the 
pixel x, relative to an arbitrarily chosen reference angle. 
The sum in Eq. (|D2p runs over timestream samples i 
which fall in pixel x with beam orientation a^. Note that 
w{x,a) depends on the choice of reference direction, or 
local frame at x. 

We briefi y rec all the theory of spin-s fields; for more 
details see [103j . A spin-s field (— oo < s < oo) is a 
function (gf) whose value at x depends on a choice of 



25 



frame, or pair of orthonormal basis vectors {ei, B2} at x. 
Under the right-handed rotation 



e'l = (cos0)ei + (sin6')e2 
e'j = —{sm9)ei + {cos9)e2 



(D3) 



(s/) must transform as (s/)' = e^**''(s/). One can de- 
fine spin-s spherical harmonics gYemiO,^)', these are an 
orthonormal set of basis functions for spin-s fields, with 
properties that are similar to the ordinary (spin-0) spher- 
ical harmonics Y^m- 

If we Fourier transform the frame-dependent quantity 
w{x, a) in the angle a: 



w{x, a) 



(,«;(a;))*e'^" 



(D4) 



then sw{x) will be a spin-s field as suggested by the no- 
tation. 

Now we can write an expression for the beam- 
convolved CMB temperature T{x): 

f{x)^ J d?x'T{x') J ^w{x,a)2pGi0^^,,-a) (D5) 

where O^x' denotes the angle between points x, x' , and the 
subscript "2P" on any frame-dependent quantity (such as 
w{x,a)) indicates the "two-point" frame: the reference 
directi on e i at x points toward x' . 

Eq. (|D5[) simply states that the beam-convolved tem- 
perature at X is given by averaging over scan directions 
a, with the beam profile rotated through angle a be- 
fore it is applied. To simplify this expression, we plug in 
Eqs. dm]), dnH), obtaining: 



T{x)^ j £xT{x)Y,{sw{x)2vygisY,s{Sxx',Q) (D6) 



Now use the identity 



to obtain 



An 



21- 



-J2isYi^{x)hpY;^ix') (D7) 



An 
2i+l 



isw{x))*gesaimisYim{x)) (D8) 



This is our desired expression for T. The final result is a 
spin-0 quantity, so we have dropped the 2P. 



Eq. (jPSp is a sum over beam multipoles s multiplied 
by the spin-s component of the scan strategy. Note that 
the spin-0 component (ou'(x)) is equal to 1 by construc- 
tion (Eq. (|D2])), so that the s = term in Eq. fPSl) 
does not depend on the scan strategy and is simply 
given by convolving {aem} with the beam transfer func- 
tion = An 1(21 -I- l)g£o- The higher-spin terms do 
depend on the scan strategy and represent corrections 
to the symmetric-beam approximation. If the beam is 
azimuthally symmetric [gis — for s > 0), or the 
beam is arbitrary but the scan is isotropic in each pixel 
(sw(a;) = for s > 0), then the higher spin terms do not 
contribute and the symmetric-beam approximation is ex- 
act. For WMAP, we find that the sum over s in Eq. jDSj 
converges rapidly so that truncating at Ssmax = 16 fully 
incorporates beam asymmetry. 

A fast algorithm for evaluating Eq. (jPSP may be given 
by noting that each term in the s sum is simply a spin-s 
spherical transform. In an isolatitude coordinate system, 
a spin-s transform may be performed with computational 
cost O(^max) using the recursion 



sYl-Y ,m Pl~l,m{sY£—2,m) 

(D9) 



XsYim) =[Z + 



l{l - 1) 



on each isolatitude ring, where we have defined p^^ 
^(^2 - rrfi){P - s2)/(4^2 _ Thus the total com- 

putational cost of incorporating beam asymmetry via 
Eq. jDSj is 0{s^ 

Finally, we include a detail which is specific to WMAP. 
The preceding treatment has assumed that there is one 
beam gts and one scan sw{x) for each simulated map. 
In WMAP, we have one simulated map per differencing 
assembly, obtained as the difference of A-side and B-side 
measurements. In this case, one makes the replacement 

{sw{x)rg,, -> {,w^{x)rgi + {sw"" {x))* gf, (DIO) 

in Eq. (|D8p . where gtm^dFrn ^^'^ A-side and B-side 
beams, and , are defined by 



Af N dcf „ Eaea;'^("~"a 

w (x,a) = 2n 



dof 



w {x,(x) — 2t: 



(Dll) 
(D12) 



where Eae^'Ebea; denote sums over A-side and B-side 
timestream samples which fall in pixel x. 
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